Abstract

Jet physics is a rich and rapidly evolving field, with many applications to 
physics in and beyond the Standard Model. These notes, based on lectures de- 
livered at the June 2012 Theoretical Advanced Study Institute, provide an intro- 
duction to jets at the Large Hadron Collider. Topics covered include sequential 
jet algorithms, jet shapes, jet grooming, and boosted Higgs and top tagging. 

1 Introduction

These notes are writeups of three lectures delivered at the Theoretical Advanced Study 
Institute in Boulder, Colorado, in June 2012. The aim of the lectures is to provide 
students who have little or no experience with jets with the basic concepts and tools 
needed to engage with the rapidly developing ideas concerning the use of jets in new 
physics searches at the LHC. A certain amount of familiarity with the structure of QCD, 
and in particular with QCD showers, is assumed. 

Lecture one introduces sequential jet algorithms, and develops several main tools in 
substructure analyses using the boosted Higgs as an example. Lecture two delves further 
into jet grooming and jet shapes, and in lecture three we conclude with an overview of
top tagging and BSM searches. 



2 Lecture I: Jets, Subjets, and Sequential Jet Algorithms

To understand jet substructure and its applications, we must first begin by understanding
jets. Jets, together with parton distribution functions and factorization theorems,
are the phenomenological tools that allow us to separate out the perturbatively describable
hard interactions in proton-proton collisions, and thereby enable us to make quantitative
predictions for events involving strongly interacting particles. Jet cross-sections
necessarily depend on the algorithm used to define a jet. There are many jet algorithms,
each one with its own strengths and weaknesses.

The first jet algorithm was developed for $e^+e^- \to$ hadrons events by Sterman and
Weinberg in 1977 [1]. In this algorithm events are declared to have two jets if all but a
fraction $\epsilon$ of the total energy in the event can be contained within two cones of
half-angle $\delta$. That is, radiation off of one of the initial partons must be sufficiently hard,

$E_{\rm rad} > \epsilon$ (2.1)

and at sufficiently wide angles from either of the other jets,

$\theta_{\rm min} > \delta$ (2.2)

for the radiation to be resolved as a separate jet. How many events have two jets and
how many contain three or more obviously depends on the exact values chosen for $\epsilon$
and $\delta$. For all sufficiently large $\epsilon/E_{\rm tot}$ and $\delta$, the partonic
cross-section for radiation of an extra parton into the region of phase space defined by
Eqs. 2.1 and 2.2 is sufficiently isolated from the soft and collinear singular regions of
phase space that rates and distributions can be calculated reliably in perturbation theory.
Of course, this is a partonic calculation, and to fully match the partonic picture onto
reconstructed sprays of hadrons requires some additional theoretical machinery to describe
such effects as (for example) hadronization. For our purposes, however, a parton shower
picture will suffice.

The Sterman-Weinberg algorithm is the ur-example of a cone algorithm. While cone
algorithms present a very intuitive picture of parton radiation, they can be somewhat
clumsy in practice, particularly as the number of jets increases, and they are not in active
use in most experiments today. Other algorithms can deal much more flexibly with high
jet multiplicity. One such flexible algorithm is the JADE algorithm, developed by the
JADE collaboration in the late 1980s, also for $e^+e^- \to$ hadrons [2, 3]. Here, jets are
constructed by iteratively recombining final state particles. Define a metric to measure
the separation between final state particles $i$ and $j$,

$y_{ij} = \frac{2 E_i E_j (1 - \cos\theta_{ij})}{Q^2} \approx \frac{m_{ij}^2}{Q^2}$ (2.3)

where $Q$ is the total energy of the event. Note that $y_{ij}$ vanishes if either $i$ or $j$
is soft ($E_i \to 0$ or $E_j \to 0$), or if $i$ and $j$ are collinear ($\cos\theta_{ij} \to 1$).
We can now construct jets using the following recipe:

• Compute the interparticle distances $y_{ij}$ for all particles in the final state, and find
the pair $\{i,j\}$ with the minimum $y_{ij}$.

• If this minimum $y_{ij} < y_0$ for some fixed parameter $y_0$, combine $i$ and $j$ into a new
particle, and go back to the previous step.

• If $y_{ij} > y_0$, declare all remaining particles to be jets.

Since clustering of particles proceeds from smaller values of $y_{ij}$ to larger values, this
recipe preferentially clusters particles that are probing the regions of phase space
dominated by the soft and collinear singularities. In a sense, the algorithm is trying to
combine hadrons into partons by making its best guess for the reconstructed parton
shower. The JADE algorithm has only one parameter, the separation cutoff $y_0$, and
clearly can handle different jet multiplicities in an efficient way by varying $y_0$. It is the
ur-example of a sequential recombination algorithm, and the ancestor of all jet algorithms
in wide use at the LHC.
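
The recipe above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production jet finder: the function name `jade_cluster`, the `(E, px, py, pz)` tuple layout, and the choice to merge a pair by adding four-vectors are all choices made here for illustration, and the brute-force pair search is only sensible for a handful of particles.

```python
import math
from itertools import combinations

def jade_cluster(particles, y0):
    """Cluster (E, px, py, pz) tuples with the JADE metric and cutoff y0.

    Q is taken to be the summed energy of the inputs (assumption).
    """
    parts = [tuple(p) for p in particles]
    q2 = sum(p[0] for p in parts) ** 2

    def y(a, b):
        # cos(theta_ab) from the three-momenta of a and b
        dot = a[1] * b[1] + a[2] * b[2] + a[3] * b[3]
        norm_a = math.sqrt(a[1] ** 2 + a[2] ** 2 + a[3] ** 2)
        norm_b = math.sqrt(b[1] ** 2 + b[2] ** 2 + b[3] ** 2)
        return 2.0 * a[0] * b[0] * (1.0 - dot / (norm_a * norm_b)) / q2

    while len(parts) > 1:
        # Step 1: find the closest pair in the JADE metric.
        i, j = min(combinations(range(len(parts)), 2),
                   key=lambda ij: y(parts[ij[0]], parts[ij[1]]))
        # Step 3: if even the closest pair is above the cutoff, stop.
        if y(parts[i], parts[j]) >= y0:
            break
        # Step 2: otherwise merge the pair and start over.
        merged = tuple(a + b for a, b in zip(parts[i], parts[j]))
        parts = [p for k, p in enumerate(parts) if k not in (i, j)] + [merged]
    return parts

# Two back-to-back hard particles plus one soft, nearly collinear emission.
event = [(50.0, 0.0, 0.0, 50.0), (45.0, 0.0, 0.0, -45.0), (5.0, 0.0, 0.5, -4.975)]
two_jets = jade_cluster(event, y0=0.01)
```

Lowering `y0` resolves the same event into more jets, which is the sense in which a single parameter controls the jet multiplicity.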

The most direct descendant of the JADE algorithm is the $k_T$ algorithm, which
replaces the particle energy factor $E_i E_j$ in the JADE metric, Eq. 2.3, with the factor
$\min(E_i^2, E_j^2)$:

$y_{ij} = \frac{2 \min(E_i^2, E_j^2)(1 - \cos\theta_{ij})}{Q^2}$ (2.4)

This still ensures that the metric goes to zero when either particle is soft ($E_i \to 0$ or
$E_j \to 0$), but has the advantage that the relative softness of a particle depends only on
its own energy, and not that of the other particle in the pair. This fixes up a technical
drawback to the JADE algorithm, where $y_{ij}^{\rm JADE} \propto E_i E_j$ allows two very soft
particles to be combined even if they are at very wide angles from each other. Using
$y_{ij} \propto \min(E_i^2, E_j^2)$ means soft particles will get preferentially clustered with
nearby harder particles instead.
For small $\theta_{ij}$, the numerator of Eq. 2.4 can be written as simply $k_\perp^2$, the squared
transverse momentum of the softer particle relative to the harder particle, hence the name
of the algorithm. In this form the metric is directly related to QCD splitting functions.

To create a version of the $k_T$ algorithm that can be used at hadron colliders, where
the total energy is unknown, both the algorithm and the metric have to be adapted
[5]. In the metric, we simply use the longitudinally boost-invariant quantities $p_T$ and
$\Delta R$ instead of $E$ and $\cos\theta_{ij}$, and let the metric become dimensionful:

$d_{ij} = \min(p_{T,i}^2, p_{T,j}^2) \frac{\Delta R_{ij}^2}{R^2}$ (2.5)

The angular parameter $R$ introduced here will replace $y_0$ as determining the cutoff for
combining particles, as we will see. We need in addition to define the quantities

$d_{iB} = p_{T,i}^2$ (2.6)

for each particle $i$, since we need to also consider splittings from the beam.
The recombination algorithm now works as follows:
The recombination algorithm now works as follows: 

• Compute $d_{ij}$ and $d_{iB}$ for all particles in the final state, and find the minimum
value.

• If the minimum is a $d_{iB}$, declare particle $i$ a jet, remove it from the list, and go
back to step one.

• If the minimum is a $d_{ij}$, combine particles $i$ and $j$, and go back to step one.

• Iterate until all particles have been declared jets.

This algorithm is usually what is meant when the $k_T$ algorithm is referred to, but
you may occasionally see it referred to as the inclusive $k_T$ algorithm, as there is a related
("exclusive") variant [6]. Note that the parameter $R$ functions as an angular cut-off:
two particles separated by a distance $\Delta R_{ij} > R$ will never be combined, regardless of the
$p_T$'s of the particles, since $d_{ij}$ then exceeds the smaller of the two beam distances
(this does not necessarily preclude both particles being clustered
into the same jet later). In fact, with this jet algorithm, arbitrarily soft particles can
become jets. Therefore jets are customarily returned only down to some finite $p_T$ cutoff,
typically tens of GeV.
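
As an illustration of the four steps above, here is a brute-force Python sketch of the inclusive $k_T$ algorithm, using the standard hadron-collider metric $d_{ij} = \min(p_{T,i}^2, p_{T,j}^2)\Delta R_{ij}^2/R^2$ and $d_{iB} = p_{T,i}^2$. The names `inclusive_kt` and `from_pt_y_phi`, the `(px, py, pz, E)` layout, four-vector addition as the recombination rule, and the `ptmin` argument are assumptions made for this example; fast implementations organize the search very differently.

```python
import math

def from_pt_y_phi(pt, y, phi):
    """Massless four-vector (px, py, pz, E) from pT, rapidity and azimuth."""
    return (pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(y), pt * math.cosh(y))

def pt2(p):
    return p[0] ** 2 + p[1] ** 2

def rapidity(p):
    # Undefined for particles exactly along the beam (E == |pz|).
    return 0.5 * math.log((p[3] + p[2]) / (p[3] - p[2]))

def delta_r2(a, b):
    dphi = abs(math.atan2(a[1], a[0]) - math.atan2(b[1], b[0]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return (rapidity(a) - rapidity(b)) ** 2 + dphi ** 2

def inclusive_kt(particles, R, ptmin=0.0):
    parts = [tuple(p) for p in particles]
    jets = []
    while parts:
        # Smallest beam distance d_iB = pT_i^2.
        beam = min(range(len(parts)), key=lambda i: pt2(parts[i]))
        dmin, pair = pt2(parts[beam]), None
        # Compare against every pairwise distance d_ij.
        for i in range(len(parts)):
            for j in range(i + 1, len(parts)):
                dij = (min(pt2(parts[i]), pt2(parts[j]))
                       * delta_r2(parts[i], parts[j]) / R ** 2)
                if dij < dmin:
                    dmin, pair = dij, (i, j)
        if pair is None:
            jets.append(parts.pop(beam))          # minimum is a d_iB: declare a jet
        else:
            i, j = pair                           # minimum is a d_ij: merge i and j
            merged = tuple(a + b for a, b in zip(parts[i], parts[j]))
            parts = [p for k, p in enumerate(parts) if k not in pair] + [merged]
    jets = [j for j in jets if pt2(j) >= ptmin ** 2]
    return sorted(jets, key=pt2, reverse=True)

event = [from_pt_y_phi(100.0, 0.0, 0.0), from_pt_y_phi(5.0, 0.1, 0.1),
         from_pt_y_phi(80.0, 0.0, 3.0), from_pt_y_phi(1.0, 2.0, -2.0)]
all_jets = inclusive_kt(event, R=0.6)
hard_jets = inclusive_kt(event, R=0.6, ptmin=10.0)
```

In this toy event the isolated 1 GeV particle is declared a jet on the very first pass, which is the behaviour that motivates the `ptmin` cut.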

Because the $k_T$ algorithm clusters particles beginning with soft particles and working
its way up to harder particles, the algorithm tends to construct irregular jets which
depend on the detailed distribution of soft particles in an event. For this reason, $k_T$ jets
are not especially practical for hadron colliders: irregular jets are hard to calibrate, and
the jets are quite sensitive to unrelated radiation in the event.

Other sequential algorithms are obtained by using different metrics. The
Cambridge-Aachen or C-A algorithm is obtained by taking [7]

$d_{ij} = \frac{\Delta R_{ij}^2}{R^2}, \qquad d_{iB} = 1$ (2.7)

This metric clusters particles based only on their angular separation, giving a nicely 
geometric interpretation of jets. The C-A algorithm still reflects aspects of the QCD 
parton shower, in particular the angular ordering of emissions. However, it is less directly 
related to the structure of QCD parton splitting functions than the $k_T$ algorithm is, and
represents a compromise between reflecting the structure of the parton shower and 
maintaining some insensitivity to soft radiation. 

The anti-$k_T$ algorithm entirely abandons the idea of mimicking the parton shower
[8]. Here, the metric is

$d_{ij} = \min\left(\frac{1}{p_{T,i}^2}, \frac{1}{p_{T,j}^2}\right) \frac{\Delta R_{ij}^2}{R^2}, \qquad d_{iB} = \frac{1}{p_{T,i}^2}$ (2.8)



With this metric, particles are clustered beginning with the hardest particles. This
means that the most energetic cores of jets are found first. As soft particles clustered
later have a minimal impact on the larger four-momentum of the jet core, the anti-$k_T$
algorithm tends to cluster particles out to distances $R$ from the core of a jet, yielding
very regular jets. Anti-$k_T$ jets are therefore much easier to calibrate at experiments,
and the anti-$k_T$ algorithm has become the default used at the LHC.
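
It can help to view the three hadron-collider metrics as members of a single family labelled by an exponent $p$ (often called the generalized-$k_T$ family). The symbol $p$ is introduced here only as a bookkeeping device and is not part of the notation used elsewhere in these notes:

```latex
d_{ij} = \min\!\left(p_{T,i}^{2p},\, p_{T,j}^{2p}\right) \frac{\Delta R_{ij}^2}{R^2},
\qquad
d_{iB} = p_{T,i}^{2p},
\qquad
p =
\begin{cases}
 +1 & k_T \\
 \phantom{+}0 & \text{C-A} \\
 -1 & \text{anti-}k_T
\end{cases}
```

In every case the ratio $d_{ij}/\min(d_{iB}, d_{jB}) = \Delta R_{ij}^2/R^2$ is the same, so the angular reach $R$ is common to all three algorithms and only the ordering of the clusterings differs.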

Let us conclude this section by emphasizing that all sequential jet algorithms re- 
turn not only a list of jets but a clustering sequence for the event. Varying the radial 
parameter R simply acts to move the resolution scale up and down the clustering se- 
quence, making it very easy to study how jet distributions and multiplicities depend 
on the angular resolution R. In particular, for the C-A algorithm, the cluster sequence 
regarded as a function of R has a purely geometric interpretation as resolving the event 
on different angular scales. 

All three sequential jet algorithms discussed here also share the same reach, that is, 
regardless of the chosen metric, a splitting $P \to ij$ will not be combined if the angular
distance between the daughters exceeds the chosen jet radius, $\Delta R_{ij} > R$. This means
that, to leading order, perturbative computations of quantities such as jet rates are 
identical between all three algorithms. 

Finally, the infrared and collinear safety of all three sequential jet algorithms can
be easily checked by asking how the cluster sequence would change with the addition
of a soft or collinear emission. For the shower-sensitive $k_T$ and C-A metrics, infrared
and collinear safety follows automatically. The anti-$k_T$ metric is also manifestly IR- and
collinear-safe, as can be seen with a little more thought: anti-$k_T$ recombinations are
clearly collinear-safe, since collinear splittings are combined near the beginning of the
sequence. IR safety also follows, as soft radiation has negligible impact on the jet built
out from the hard core.

2.1 Jets at the LHC 

The main subject of these lectures is the possibilities and uses of jets to discover physics
at and beyond the electroweak scale, which means, for practical purposes, at the LHC.

It is important to remember that events at the LHC take place in a busy hadronic environment.
In addition to the showering and hadronizing hard partons which we want to study, 
there are large amounts of soft, unassociated radiation from (1) the underlying event, 
that is, the remnants of the scattering protons; (2) possible multiple interactions, that is, 
additional collisions of partons arising from the same p-p collision as the hard interaction; 
and (3) pile-up, additional p-p collisions from other protons in the colliding bunches. 
These additional sources of radiation contribute a potentially sizable and largely uniform 
backdrop of hadronic activity that, when clustered into jets, will partially obscure the 
features of the hard interaction that we would like to reconstruct. 

The default jets used at the LHC are formed using the anti-$k_T$ algorithm, with cone
sizes $R = 0.4, 0.6$ (at ATLAS) and $R = 0.5, 0.7$ (at CMS). These specific choices of $R$
come from a compromise between (1) the desire to collect all the radiation from a single 
parton, and (2) the desire not to sweep up an excessive amount of unrelated radiation. 

Many advances have combined to make jets at the LHC a particularly fertile field. 

• advances in experiment: the calorimeters at ATLAS and CMS have much finer 
resolution than in previous experiments, allowing a much more finely grained pic- 
ture of events. Moreover, local calibration of jets allows jets to be considered on 
multiple scales. 

• advances in computation: the development of fast algorithms [9] allows broad 
implementation of sequential recombination. 

• advances in energy: the LHC center of mass energy is large enough that particles
with weak scale masses (i.e., $Z$, $W$, $t$, and $H$) will for the first time have
an appreciable cross-section to be produced with enough of a boost to collimate
the daughter partons. The simple picture that one parton corresponds to one
jet breaks down badly in this case, and new tools are needed to separate out
collimated perturbative decays from QCD showers.

There are several reasons to be interested in boosted particles. Very often, there is 
theoretical motivation to focus on a particular slice of phase space where the daughter 
particles are necessarily boosted. High mass resonances are the simplest such examples. 
For instance, a resonance $\rho_C$ with mass $m_\rho > 1.5$ TeV which decays to pairs of gauge
bosons would yield highly boosted VV pairs. 

Even in the absence of a resonance or other mechanism to preferentially populate 
boosted regions of phase space, looking for boosted signals can also be useful for improv- 
ing the signal to background ratio. Changing the reconstruction method changes what 
the experimental definition of the signal is, and therefore necessarily the backgrounds 
change as well. This can sometimes — but not always! — be enough of an advantage to 
make up for the reduction in signal rate that comes from selecting only the boosted re- 
gion of phase space. Background reduction comes in two forms. In high multiplicity final 
states, combinatoric background is often prohibitive. When some or all of the final state 
particles are boosted, the combinatoric background is greatly reduced. But it is also 
possible to use boosted selection techniques to identify regions where the background 
from other physics processes is intrinsically reduced. 

To appreciate the need for new reconstruction techniques at the LHC, consider the
production of top quarks at fixed center of mass energy $\sqrt{s}$. Choosing some angular
scale $R_0$, we can ask, what fraction of top quarks have all three, only two, or none of
their partonic daughters isolated from the others at the scale $R_0$? This gives a zeroth
order estimate of how well a jet algorithm with $R = R_0$ will be able to reconstruct the
three partonic top daughters as separate jets. The answer we get depends sensitively
on both $R_0$ and $\sqrt{s}$:





$\sqrt{s}$    $R_0$    3       2       1
1.5 TeV       0.4      0.55    0.45
1.5 TeV       0.6      0.2     0.6     0.2
2.0 TeV       0.6      0.1     0.45    0.45

Table 1: Resolved parton multiplicities in $t\bar t$ events

Clearly, tops produced in the very interesting super-TeV regime $\sqrt{s} >$ TeV straddle
the borderlines between several different topologies. It would be much more desirable
to have a flexible reconstruction method that could handle semi-collimated tops in a
unified way.

To see how we can go about building such reconstruction techniques, let's start by 
considering one of the landmark jet substructure analyses: the case of a boosted Higgs 
decaying into $b\bar b$.

2.2 Boosted Higgs 

This analysis will introduce us to several ideas that will be important tools in our 
boosted analysis toolbox: fat jets, jet mass, jet grooming, and sequential de-clustering. 

Searching for the Higgs in its decay to $b\bar b$ is very difficult at the LHC, due to
overwhelming QCD backgrounds. Even in associated production, $pp \to HZ, HW$, the
background processes $Z + b\bar b$, $W + b\bar b$, and even $t\bar t$ are overwhelming. Nonetheless,
thanks to Ref. [10], $pp \to HV$, $H \to b\bar b$ is now an active search channel at the LHC.

To be specific, let's consider the process $pp \to HZ$, followed by $H \to b\bar b$ and
$Z \to \ell^+\ell^-$. The traditional approach to this signal would be to look for final states with
a leptonic $Z$ and two $b$-tagged jets, construct the invariant mass of the jets, and look for a
peak in the distribution of $m_{b\bar b}$. The new approach is instead to focus on events where
the Higgs is produced with substantial $p_T$, $p_{T,H} > 200$ GeV, and cluster these events
with a large ($R = 1.2$) jet radius, such that all of the Higgs decay products are swept
up in a single fat jet. The signal is now a leptonic $Z$ + a fat "Higgs-like" jet, and the
background to this signal is now $Z$ + one fat jet rather than $Z + b\bar b$. What we'll see
is that jet substructure offers us enough quantitative precision in what we mean by a
"Higgs-like" jet to reduce the background by an extent that makes up for the acceptance
price demanded by the high $p_T$ cut.

For an unboosted search, the ultimate discriminator between signal and background 
is the b-b invariant mass: to find a resonance, look for a bump in the b-b mass spectrum. 
Now that we have boosted the Higgs and collected it into a single fat jet, the Higgs mass 
should be reflected in the invariant mass of the fat jet itself. To understand jet masses 
for the background, let's take a quick look at how jet masses are generated in QCD. 

Jet Mass. Partons are generally massless (we will neglect the $b$ quark mass), but jets
are not. Jet mass in QCD arises from emission during the parton shower, and as such we
can calculate the leading contribution. Jet mass, like most perturbative jet properties
in QCD, is dominated by the first emission. Let's consider for concreteness a quark
emitting a gluon, and work in the collinear regime (small $R$). In this approximation,
we can consider the jet in isolation from the rest of the event, neglecting interference
and splash-in, and we can approximate the QCD splitting functions with the singular
portions. Doing so, the cross-section to radiate an extra parton can be written as

$d\sigma_{n+1} \simeq d\sigma_n \, dz \, \frac{dt}{t} \, \frac{\alpha_s}{2\pi} P(z)$, (2.9)

where $t$ is the virtuality of the parent $P$, $z = E_q/E_P$ is the fraction of the parent energy
retained by the daughter quark, and the splitting function $P(z)$ for $q \to qg$ is given by

$P(z) = C_F \frac{1 + z^2}{1 - z}$. (2.10)

The parent virtuality $t$ is of course the jet mass-squared. In the collinear limit,

$t = E_P^2 z(1-z)\theta^2 = (p_{T,P} \cosh\eta)^2 z(1-z)\theta^2$. (2.11)

Integrating over rapidity, we can approximate the average jet mass-squared as:

$\langle m^2 \rangle \simeq p_T^2 \int_0^{R^2} \frac{d\theta^2}{\theta^2} \int dz \, z(1-z)\theta^2 \, \frac{\alpha_s}{2\pi} P(z)$. (2.12)

Note the limits on the $\theta$ integral: this is where the choice of jet algorithm enters. As
established above, for all sequential jet algorithms, only radiation at angles smaller than
$R$ will be clustered into the jet. Strictly, we should use a running $\alpha_s$ evaluated at a scale
set by the relative transverse momentum of the splitting, but to get a quick estimate,
let's perform the integral in the approximation that $\alpha_s$ is constant. We then obtain

$\langle m^2 \rangle \simeq \frac{\alpha_s}{\pi} \frac{3}{8} C_F \, p_T^2 R^2$. (2.13)

The jet mass scales like $p_T$, as it had to, and is suppressed by $(\alpha_s/\pi)^{1/2}$. To this order
the mass increases linearly with $R$. The exact value of the numerical coefficient will in
general depend on the quark versus gluon content of the jet sample. For instance, the
major QCD background for a doubly $b$-tagged boosted Higgs comes from the splittings
$g \to b\bar b$, where the splitting function is

$P(z) = C_A \left(z^2 + (1-z)^2\right)$, (2.14)

giving, in the constant-$\alpha_s$ approximation,

$\langle m^2 \rangle \simeq \frac{\alpha_s}{\pi} \frac{1}{20} C_A \, p_T^2 R^2$. (2.15)
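
The numerical coefficients in these estimates come entirely from the $z$ integral of $z(1-z)$ times the splitting function, multiplied by the $1/2$ from $\alpha_s/2\pi$. The short Python check below evaluates those integrals with a simple midpoint rule, with the colour factors stripped out. The helper names, the value $\alpha_s = 0.12$, and the sample kinematics are assumptions chosen for illustration only.

```python
import math

def midpoint(f, n=100_000):
    """Midpoint-rule integral of f over [0, 1]; never evaluates the endpoints."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))

def split_q_to_qg(z):
    # (1 + z^2) / (1 - z), colour factor omitted
    return (1.0 + z * z) / (1.0 - z)

def split_g_to_qq(z):
    # z^2 + (1 - z)^2, colour factor omitted
    return z * z + (1.0 - z) ** 2

# Coefficient multiplying (alpha_s / pi) * colour * pT^2 * R^2
coeff_quark = 0.5 * midpoint(lambda z: z * (1.0 - z) * split_q_to_qg(z))
coeff_gluon = 0.5 * midpoint(lambda z: z * (1.0 - z) * split_g_to_qq(z))

def rms_mass(pt, R, coeff, colour, alpha_s=0.12):
    """sqrt(<m^2>) in the constant-alpha_s, small-R estimate."""
    return math.sqrt(alpha_s / math.pi * coeff * colour) * pt * R

# Quark jet with pT = 200 GeV and R = 1.2 (illustrative numbers)
m_quark_jet = rms_mass(200.0, 1.2, coeff_quark, colour=4.0 / 3.0)
```

With these inputs the quark-jet estimate comes out at a few tens of GeV, which gives a feel for the QCD jet masses a boosted-Higgs search has to contend with.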

Coming back to the Higgs, consider now a splitting $P \to ij$. We have
$m^2 \simeq 2 p_i \cdot p_j \simeq p_{T,i} \, p_{T,j} \, \Delta R_{ij}^2 = z(1-z) \, p_{T,P}^2 \, \Delta R_{ij}^2$.
In other words, just from kinematics we can
express the opening angle in terms of the parent mass and $p_T$:

$\Delta R_{ij} \simeq \frac{m}{p_T} \frac{1}{\sqrt{z(1-z)}} \geq \frac{2m}{p_T}$. (2.16)
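
As a quick numerical illustration of this opening-angle relation, the snippet below evaluates it for a 125 GeV parent at two sample transverse momenta. The function name `opening_angle` and the chosen $p_T$ and $z$ values are assumptions for illustration.

```python
import math

def opening_angle(m, pt, z=0.5):
    """Approximate Delta R between the daughters of a boosted parent (small-angle limit)."""
    return m / (pt * math.sqrt(z * (1.0 - z)))

m_higgs = 125.0  # GeV
for pt in (250.0, 500.0):
    symmetric = opening_angle(m_higgs, pt)            # z = 1/2 gives the minimum, 2m/pT
    asymmetric = opening_angle(m_higgs, pt, z=0.2)    # asymmetric decays open wider
    print(f"pT = {pt:.0f} GeV: dR(z=0.5) = {symmetric:.2f}, dR(z=0.2) = {asymmetric:.2f}")
```

The symmetric configuration sets the smallest possible separation, so a fat-jet radius has to be chosen with the asymmetric tail in mind.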

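Now consider the $k_T$ metric evaluated on this splitting $P \to ij$, normalized to the
parent mass-squared:

$y_{ij} = \frac{\min(p_{T,i}^2, p_{T,j}^2) \, \Delta R_{ij}^2}{m^2} \simeq \frac{\min(z, 1-z)}{\max(z, 1-z)}$. (2.17)

For jets with a fixed mass $m$, cutting on the splitting scale $y_{ij}$ then can separate QCD
jets, which have a soft singularity $\propto 1/z$, from boosted Higgses, which have a flat
distribution in $z$.^1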

Moreover, a boosted Higgs will go from a mass $m_H$ to massless daughters in one step,
while QCD splittings prefer to shed virtuality gradually. To see this, consider the
Sudakov form factor, which exponentiates the splitting functions to obtain the probability
of evolving from an initial virtuality $t_0$ to a final virtuality $t$ without branching:



$\Delta(t) = \exp\left[ -\int_{t_0}^{t} \frac{dt'}{t'} \int dz \, \frac{\alpha_s}{2\pi} P(z) \right]$. (2.18)



Evaluating $\alpha_s = \alpha_s(t')$ and using an IR cut-off to regulate the splitting functions, at
large $t$, one can work out that [11]

$\Delta(t) \approx \left( \frac{t_0}{t} \right)^p$ (2.19)

for an exponent $p > 0$, so that $\Delta(t) \to 0$ for large $t$. In other words, the
probability of a QCD jet making a large jump in mass at a branching falls off as $m^{-2p}$.^2

We have now identified two ways in which a Higgs boson $H$ decaying perturbatively
to $b\bar b$ will behave very differently from a QCD parton branching: the splitting will be
symmetric, and show a sudden drop in parton mass. The search algorithm for finding a
boosted Higgs looks for a splitting inside the Higgs jet that behaves like a perturbative
decay, and works as follows:



^1 This is a little quick: not all QCD splitting functions have a soft singularity, and in particular
$g \to q\bar q$ does not. However, $P_{g \to q\bar q}(z)$ is not flat in $z$, and in particular is minimized at the
symmetric value $z = 1/2$, so cutting on $y_{ij}$ can still help suppress this background.

^2 In fact, taking higher order corrections into account, one finds that the Sudakov form factor goes
to zero even faster than polynomially for large $t$.

• Cluster the event on a large angular scale (Ref. [10] uses $R = 1.2$), using the C-A
algorithm. Large angular scales are necessary in order to get good acceptance for
collecting both Higgs decay products into a single fat jet: from Eq. 2.16, we can
see that the $b$-$\bar b$ separation for a 125 GeV Higgs boson is $R_{b\bar b} \lesssim 1$ for $p_T > 200$
GeV. We choose the C-A algorithm because it is a good compromise between
accurately reflecting the shower structure of QCD, and minimizing sensitivity to
soft radiation in the event.

• Now, given a hard fat jet, successively unwind the jet by undoing the cluster
sequence one branching at a time. At each branching $P \to ij$, check to see whether
the splitting looks sufficiently non-QCD-like, by asking that the branching be both
hard,

$\max(m_i, m_j) < \mu \, m_P$ (2.20)

for some parameter $\mu$, and symmetric,

$y_{ij} > y_{\rm cut}$ (2.21)

for some choice of $y_{\rm cut}$.

• If the splitting fails to be sufficiently hard and symmetric, discard the softer of i 
and j, and continue to unwind the harder. 

• Continue until either an interesting splitting has been found or you run out of jet. 

This procedure, often referred to as the "splitting" or "mass-drop" procedure, identifies
an interesting Higgs-like splitting $H \to b\bar b$, which determines a characteristic angular
scale $R_{b\bar b}$ for a particular event. Once this scale $R_{b\bar b}$ has been identified, we benefit
greatly by using smaller scales to resolve the event, rather than the large $R = 1.2$ scale
we started with.
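
The unwinding loop described above can be sketched on a toy cluster tree. Everything structural here is an assumption made for illustration: the `Node` class, the `(E, px, py, pz)` layout, hand-built trees standing in for a real C-A cluster sequence, and the default values `mu=0.67` and `y_cut=0.09`, which are meant only as representative choices. The symmetry measure is the normalized $k_T$ distance, $\min(p_{T,i}^2, p_{T,j}^2)\Delta R_{ij}^2/m_P^2$.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Node:
    p: Tuple[float, ...]                                  # (E, px, py, pz)
    children: Optional[Tuple["Node", "Node"]] = None     # None for a leaf

def massless(pt, y, phi):
    return Node((pt * math.cosh(y), pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(y)))

def merge(a, b):
    return Node(tuple(x + y for x, y in zip(a.p, b.p)), (a, b))

def mass(p):
    return math.sqrt(max(p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2, 0.0))

def pt(p):
    return math.hypot(p[1], p[2])

def rap(p):
    return 0.5 * math.log((p[0] + p[3]) / (p[0] - p[3]))

def delta_r(a, b):
    dphi = abs(math.atan2(a[2], a[1]) - math.atan2(b[2], b[1]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(rap(a) - rap(b), dphi)

def mass_drop(jet, mu=0.67, y_cut=0.09):
    """Return (node, R_bb) for the first hard, symmetric splitting, or None."""
    node = jet
    while node.children is not None:
        hard, soft = sorted(node.children, key=lambda n: pt(n.p), reverse=True)
        m_parent = mass(node.p)
        if m_parent > 0.0:
            dr = delta_r(hard.p, soft.p)
            y = (pt(soft.p) * dr / m_parent) ** 2
            if max(mass(hard.p), mass(soft.p)) < mu * m_parent and y > y_cut:
                return node, dr
        node = hard          # discard the softer branch, keep unwinding the harder
    return None

# Toy fat jet: a symmetric two-body decay plus one soft wide-angle particle clustered last.
b1, b2 = massless(150.0, 0.0, 0.0), massless(100.0, 0.6, 0.5)
higgs_like = merge(b1, b2)
fat_jet = merge(higgs_like, massless(5.0, -0.9, -0.6))
found = mass_drop(fat_jet)

# Toy QCD-like jet: one hard parton with a single soft emission.
qcd_like = merge(massless(200.0, 0.0, 0.0), massless(4.0, 0.5, 0.5))
```

In the toy fat jet the last clustering fails both tests and is unwound, after which the two-body decay is found; the QCD-like jet runs out of splittings without passing the cuts.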

The reason is the following: starting with such a large jet, we are guaranteed to sweep
up a large amount of unassociated radiation along with the Higgs decay products. The
effect of this unassociated radiation is to smear out the mass resolution. The invariant
mass is especially vulnerable to distortion from even soft unassociated radiation, because
evaluating $m^2 = E^2 - \vec{p}^{\,2}$ depends on large cancellations. The amount of distortion scales
like

$\langle \delta m^2 \rangle \sim \Lambda_{\rm soft} \, p_{T,J} \, R^4$ (2.22)

in the approximation that unassociated radiation contributes a constant energy $\Lambda_{\rm soft}$
per unit rapidity: the jet area scales like $R^2$, while the incremental contribution to
the invariant mass-squared from a soft particle at distance $R/2$ from the jet core scales
like $p_{T,J} R^2$ [12]. So to recover mass resolution, it is vital to whittle down our initial
fat jet to jets only as big as necessary to capture the radiation from the Higgs decay
products.

In fact, we have already started whittling. The splitting procedure discards soft,
wide-angle radiation clustered into the jet on its way towards finding the Higgs-like
splitting. This by itself helps to clean up the mass resolution. But we can do better:
given the scale $R_{b\bar b}$ which is our best guess at the angular separation of the Higgs'
daughter particles, we can resolve the fat jet at the filtering scale $R_{\rm filt} = \min(R_{b\bar b}/2, 0.3)$,
and keep only the three hardest subjets. We keep three, rather than two, subjets in
order to capture final-state radiation off of one of the $b$ quarks.
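
The filtering step itself is simple bookkeeping once the constituents have been reclustered at the filtering radius. The sketch below assumes that reclustering has already been done by some jet finder and that the resulting subjets are handed over as `(E, px, py, pz)` tuples; the function names and the sample numbers are hypothetical.

```python
import math

def filter_radius(r_bb, r_max=0.3):
    """Filtering scale: half the b-bbar separation, capped at r_max."""
    return min(r_bb / 2.0, r_max)

def filter_subjets(subjets, n_keep=3):
    """Keep the n_keep hardest subjets (by pT) and return them with their four-vector sum."""
    ranked = sorted(subjets, key=lambda p: math.hypot(p[1], p[2]), reverse=True)
    kept = ranked[:n_keep]
    total = tuple(sum(c) for c in zip(*kept))
    return kept, total

# Hypothetical subjets found at the filtering scale, as (E, px, py, pz) in GeV.
subjets = [(120, 118, 10, 15), (80, -5, 78, 10), (20, 12, -14, 6), (3, 1, -2, 2)]
kept, filtered_jet = filter_subjets(subjets)
```

The filtered jet mass is then computed from `filtered_jet` rather than from the original fat jet.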

Finally, demanding that the two hardest filtered subjets be $b$-tagged, Ref. [10] finds
that the Higgs can be seen in this channel with $5\sigma$ significance in 30 fb$^{-1}$ (at 14 TeV,
combining $Z \to \ell^+\ell^-$, $Z \to \nu\bar\nu$, and $W \to \ell\nu$), and signal-to-background of $\mathcal{O}(1)$.
However, we emphasize that this and all other LHC phenomenological studies are based on
expectations from Monte Carlo. Even very sophisticated Monte Carlos necessarily capture
only an approximation to the full physics of QCD. For this reason, both validation
in data on one hand and formal theoretical study on the other are critical. Let us then
end this section by showing a couple of the most important early experimental results.
In Fig. 1 we show two plots from Ref. [13]. On the left, we see that shower Monte Carlos
do a reasonable job of predicting the spectrum of jet masses for the QCD background.
On the right, the jet mass is plotted as a function of the number of primary vertices
$N_{PV}$ in an event, or in other words, the amount of pileup. Note that after filtering,
the jet mass has little to no dependence on $N_{PV}$, indicating that filtering is successfully
isolating the hard process. Note also that filtering is necessary: prior to filtering, the
dependence of jet mass on $N_{PV}$ is significant, and in the 2012 operating environment
the average pileup multiplicity is $N_{PV} > 30$.

Heartened by this evidence that our theoretical techniques have a reasonable re- 
lationship with reality, we will proceed in the next section to discuss more ideas for 
cleaning up pileup, and more jet properties which can discriminate signals from QCD 
backgrounds. 

Figure 1: (Left) The distribution of jet mass for fat C-A jets (after splitting and filtering).
Note the reasonable agreement between data and predictions from two different shower
MCs. (Right) Average jet mass as a function of the number of primary vertices $N_{PV}$.
Note that after filtering, the jet mass has little to no dependence on $N_{PV}$. From Ref. [13].



3 Lecture II: Jet Grooming and Jet Shapes 

Our last section concluded with a walk-through of the pioneering boosted Higgs study, 
where we saw examples of two topics we will be discussing in this lecture, namely jet 
grooming and jet shapes. 

3.1 Jet grooming 

In the boosted Higgs analysis discussed in the previous lecture, we saw that jet mass 
resolution was badly degraded by the presence of unassociated radiation in the jet, 
and introduced the process of filtering to mitigate these contributions. Filtering is 
one of several jet grooming algorithms, all of which are designed to "clean up" jets by 
subtracting the contributions of unassociated radiation. 

Trimming [14], similarly to filtering, reclusters the constituents of a fat jet and retains
a subset of the subjets, but has a different criterion for keeping subjets. For each jet of
interest, the algorithm is:

• Recluster the constituents using some jet algorithm (the original reference specifies
$k_T$), and resolve on a fixed small angular scale $R_0$.

• Keep each subjet $i$ that passes a $p_T$ threshold,

$p_{T,i} > f_{\rm cut} \, \Lambda_{\rm hard}$ (3.23)

for a cutoff parameter $f_{\rm cut}$ and a hard momentum scale $\Lambda_{\rm hard}$.

• The final trimmed jet is the sum of the retained subjets. 

The essential idea is that radiation we want to keep tends to be distributed in clusters,
reflecting a parent parton emission, while unassociated radiation we don't want to keep
is more uniformly distributed. Asking that radiation cluster sufficiently on small scales
then preferentially picks out the radiation which ultimately originated from a parent
hard parton. The $k_T$ algorithm was originally proposed here because it increases the
chances that soft FSR will be kept: since clustering in the $k_T$ metric works from soft up,
using $k_T$ increases the chance that a relatively soft parton emitted in the parton shower
will be reconstructed and pass above the $p_T$ threshold. But it is possible to imagine using
other algorithms for the small-scale reclustering, and indeed implementations using C-A
[15] or even anti-$k_T$ [13] have been seen to be effective.

The trimming algorithm is simple to state; the detailed questions arise when we
ask how the parameters should be chosen, and in any particular application parameter
choices should be optimized for the specific process under consideration. Typical values
for the small angular scale lie in the range $0.2 < R_0 < 0.35$; for $R_0$ much smaller than
$R_{\rm min} = 0.2$, the finite angular resolution of the calorimeter starts to introduce
irregularities. Good choices for $\Lambda_{\rm hard}$ are either the total jet $p_T$, for dijet events or other
such events where all jets have similar $p_T$s, or the scalar sum transverse energy of the event,
$H_T$, if jets have some spread over a broader range of $p_T$s. Typical values for the cutoff
parameter $f_{\rm cut}$ range between $10^{-2}$ (more typically for jet $p_T$) and $10^{-3}$ (for event $H_T$):
this tends to work out to keeping subjets down to a 5 to 10 GeV threshold.
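
A minimal sketch of the trimming selection follows, again assuming the small-radius reclustering has already been performed elsewhere. The function names, the use of the untrimmed jet $p_T$ as the hard scale, the value `f_cut=0.03`, and the toy subjets are all illustrative assumptions rather than recommended settings.

```python
import math

def massless(pt, y, phi):
    """Massless four-vector (E, px, py, pz) from pT, rapidity and azimuth."""
    return (pt * math.cosh(y), pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(y))

def four_sum(vectors):
    return tuple(sum(c) for c in zip(*vectors)) if vectors else (0.0, 0.0, 0.0, 0.0)

def jet_pt(p):
    return math.hypot(p[1], p[2])

def jet_mass(p):
    return math.sqrt(max(p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2, 0.0))

def trim(subjets, f_cut, lambda_hard):
    """Keep subjets with pT above f_cut * lambda_hard; return them and their sum."""
    kept = [p for p in subjets if jet_pt(p) > f_cut * lambda_hard]
    return kept, four_sum(kept)

# Two hard subjets plus two soft, wide-angle ones (toy numbers, GeV).
subjets = [massless(300.0, 0.0, 0.0), massless(150.0, 0.3, 0.4),
           massless(4.0, -0.8, 0.9), massless(2.0, 0.9, -0.7)]
untrimmed = four_sum(subjets)
kept, trimmed = trim(subjets, f_cut=0.03, lambda_hard=jet_pt(untrimmed))
```

Comparing `jet_mass(untrimmed)` with `jet_mass(trimmed)` shows how much of the mass was carried by the soft wide-angle subjets.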

Pruning [16, 17] builds on the observation that the mass-drop algorithm improves
mass resolution on boosted hard decays even before the filtering step, by discarding soft 
wide-angle radiation clustered into the fat jet at the final stages. In the C-A algorithm, 
the typical last clusterings in the fat jet are of stray soft radiation, usually unassociated 
with the parent particle, at wide angles to the jet core. These late, wide-angle clusterings 
have a disproportionate effect on jet mass. 

Pruning adapts the splitting algorithm to specifically check for soft, wide-angle split- 
tings, and throw them away. The algorithm is: 

• Given a jet $J$, recluster its constituents with C-A, and then sequentially unwind
the cluster sequence.

• At each splitting $P \to ij$, check whether the splitting is both soft,

$\frac{\min(p_{T,i}, p_{T,j})}{p_{T,P}} < z_{\rm cut}$, (3.24)

and at wide angle,

$\Delta R_{ij} > D_{\rm cut}$. (3.25)

If so, then drop the softer of $i$, $j$, and continue unwinding the harder.

• Stop when you find a sufficiently hard (or collinear) splitting. 

Again, this algorithm has parameters that must be optimized specifically for each process
under consideration. Typical values of $z_{\rm cut}$ are $z_{\rm cut} \sim 0.1$, while the radial separation
should be tuned to the expected opening angle for a hard process, $D_{\rm cut} \sim 2m/p_T \times 1/2$.
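
The top-down version of pruning described above can be sketched on the same kind of toy cluster tree. The `Node` class, the `(E, px, py, pz)` layout, the hand-built tree standing in for a C-A reclustering, and the defaults `z_cut=0.1` and `d_cut = m/pT` of the input jet are assumptions for illustration; full implementations apply the veto while reclustering rather than only along the hardest branch.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Node:
    p: Tuple[float, ...]                                  # (E, px, py, pz)
    children: Optional[Tuple["Node", "Node"]] = None     # None for a leaf

def massless(pt, y, phi):
    return Node((pt * math.cosh(y), pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(y)))

def merge(a, b):
    return Node(tuple(x + y for x, y in zip(a.p, b.p)), (a, b))

def mass(p):
    return math.sqrt(max(p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2, 0.0))

def pt(p):
    return math.hypot(p[1], p[2])

def rap(p):
    return 0.5 * math.log((p[0] + p[3]) / (p[0] - p[3]))

def delta_r(a, b):
    dphi = abs(math.atan2(a[2], a[1]) - math.atan2(b[2], b[1]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(rap(a) - rap(b), dphi)

def prune(jet, z_cut=0.1, d_cut=None):
    """Unwind the tree, dropping soft wide-angle branches until a hard or collinear splitting."""
    if d_cut is None:
        d_cut = mass(jet.p) / pt(jet.p)        # 2m/pT times 1/2, from the input jet
    node = jet
    while node.children is not None:
        hard, soft = sorted(node.children, key=lambda n: pt(n.p), reverse=True)
        z = pt(soft.p) / pt(node.p)
        if z < z_cut and delta_r(hard.p, soft.p) > d_cut:
            node = hard                        # soft and wide-angle: prune it away
        else:
            break                              # sufficiently hard or collinear: stop
    return node

# Toy jet: symmetric two-body decay plus a soft wide-angle particle clustered last.
decay = merge(massless(150.0, 0.0, 0.0), massless(100.0, 0.6, 0.5))
fat_jet = merge(decay, massless(5.0, -0.9, -0.6))
pruned = prune(fat_jet)
```

Here the soft wide-angle particle is removed and the symmetric decay is left untouched, so the pruned mass is lower than the mass of the input jet.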

Grooming in action. All three grooming techniques (filtering, trimming, and prun- 
ing) improve signal to background by both improving mass resolution for signal and 
suppressing QCD background. QCD jets, whose jet masses are generated by relatively 
softer and less symmetric emissions, are more likely to have their masses shifted sub- 
stantially downward by jet grooming than collimated perturbatively decaying particles 
are, thus depleting the background to high-mass searches. Both the sharp gain in signal
mass resolution and the depletion of the high mass background can be seen in Fig. 2.

We can also see in Fig. 2 that the different grooming techniques all act slightly differently
on background massive QCD jets [15]. QCD jets with high masses dominantly have
this mass generated by a relatively hard perturbative emission, which all algorithms are
designed to retain, so performance between the different algorithms is similar. However,
the effects of the different grooming algorithms on QCD backgrounds are still sufficiently
distinct that some benefit can be obtained in applying multiple grooming algorithms
[19].

At low masses, the differences between the grooming algorithms become more pronounced.
QCD jets at low masses are dominated by a hard core. Filtering keeps a fixed
number (three) of subjets, and therefore retains relatively soft radiation. Trimming, by
contrast, will typically drop all radiation except that within $R_{\rm sub}$ of the jet core. Pruning
will also typically drop all but the radiation in the core, but the resolution radius $D$ is
set to scale like $m/p_T$, and therefore $D \to 0$ as $m \to 0$. Thus at small masses typically
$R_{\rm prune} < R_{\rm sub}$, so pruning acts more aggressively than trimming.
Figure 2: The operation of filtering (green, dotted), trimming (blue, dashed), and pruning
(purple, dash-dotted) on background QCD jets (left) and boosted top jets (right).
From Ref. [13].



3.2 Jet Shapes 

Another feature of the boosted Higgs analysis we saw in the previous lecture was the 
importance of jet mass, which allowed us to concentrate signal in a sharp peak on top 
of a falling background [20]. Jet mass is an example of a jet shape: a function $f$ defined
on a jet J that quantifies the properties of the jet without the (explicit) use of any jet 
algorithm. The approach is conceptually akin to event shapes, which allow quantitative 
study of QCD without requiring specific characterization of an event in terms of jets, 
and indeed many jet shapes are descendants of event shapes. 

Before discussing individual jet shapes, let us make two general comments. First, as 
we saw for jet mass, jet shapes are vulnerable to the inclusion of unassociated radiation, 
particularly pile-up, into jets, to a greater or lesser extent depending on the particular jet 
shape, and the sensitivity of the jet shape to unassociated radiation can be important. 
Second, one should bear in mind that any reasonable jet shape needs to be both infrared-
and collinear-safe. Any linear function of particles' $p_T$ is automatically safe; factorization
theorems for other jet shapes can be proven [21].

3.2.1 Radial distribution of particles within a jet 

The probability of a showering parton to emit a daughter parton depends on the running
coupling as evaluated at the $k_\perp$ scale of the splitting. Jet shapes which measure the
angular distribution of particles in an event are therefore measuring both the strength
and the running of the strong coupling constant, and are classic probes of QCD. These
jet shapes are also sensitive to the color charge of the parent parton: since $C_F < C_A$,
an initial gluon will radiate more, and at wider angles, than an initial quark.



Jet Broadening is a classic $e^+e^-$ observable. Given a thrust axis $\hat n$, we can partition
the particles $i$ in an event into hemispheres according to ${\rm sign}(\vec p_i \cdot \hat n)$, which for dijet-like
events is equivalent to associating each particle to a jet. Hemisphere broadening is then
defined as the momentum-weighted transverse spread of the particles,



where the sum runs over all particles $i$ in a hemisphere $H$.

Differential and Integrated Jet Shapes are, thanks to a historic quirk of nomenclature,
names for two specific jet shapes: the so-called differential jet shape $\rho(r)$ and the
integrated jet shape $\Psi(r)$, which characterize the radial distribution of radiation inside a
jet. These jet shapes are also sometimes called the jet profile. Both of these shapes are
defined on an ensemble of $N$ jets formed with radius $R$. Then for $r < R$, the integrated
jet shape $\Psi(r)$ is the ensemble average of the fraction of a jet's $p_T$ which is contained
within a radius $r$ from the jet axis. Defining $r_i$ as the distance of a constituent $i$ from
the jet axis,

$$\Psi(r) = \frac{1}{N} \sum_J \sum_{i \in J} \frac{p_{T,i}\, \Theta(r - r_i)}{p_{T,J}}. \tag{3.27}$$

Here the second sum runs over all constituents $i$ of a jet $J$. The differential jet shape
$\rho(r)$ is then given by

$$\rho(r) = \frac{d\Psi(r)}{dr}. \tag{3.28}$$
These variables are often included in the suite of QCD precision measurements per- 
formed by experimental collaborations, as for instance in the ATLAS study [22], and 
are useful for validating parton shower models. 
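As an illustration, the integrated jet shape can be evaluated directly from its verbal definition, and the differential shape approximated by a finite difference. The sketch below assumes each jet is supplied as a list of `(pt, r_i)` pairs, takes the jet $p_T$ to be the scalar sum of its constituents' $p_T$, and uses a bin width `dr`; the function names and the two toy jets are hypothetical.

```python
def integrated_shape(jets, r):
    """Psi(r): ensemble average of the pT fraction within radius r of the axis.

    Each jet is a list of (pt, r_i) pairs, r_i being the distance of the
    constituent from the jet axis.
    """
    total = 0.0
    for jet in jets:
        jet_pt = sum(pt for pt, _ in jet)  # scalar-sum approximation
        total += sum(pt for pt, r_i in jet if r_i < r) / jet_pt
    return total / len(jets)


def differential_shape(jets, r, dr):
    """rho(r): finite-difference estimate of dPsi/dr over a bin of width dr."""
    return (integrated_shape(jets, r + dr) - integrated_shape(jets, r)) / dr


# Two invented jets of radius R = 0.6.
jets = [
    [(60.0, 0.05), (30.0, 0.20), (10.0, 0.50)],
    [(80.0, 0.02), (15.0, 0.30), (5.0, 0.55)],
]

print(integrated_shape(jets, 0.1))
print(differential_shape(jets, 0.1, 0.3))
```

By construction the integrated shape rises monotonically to one at the jet radius, which is a useful sanity check on any implementation.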

Girth is another jet shape which probes the radial distribution of radiation inside a
jet. Let $r_i$ again be the distance between a constituent $i$ and the jet axis. Then the
girth of a jet $g_J$ is the linear radial moment of the jet,

$$g_J = \frac{1}{p_{T,J}} \sum_{i \in J} p_{T,i}\, r_i. \tag{3.29}$$

In the collinear limit $\theta \to 0$, girth becomes equivalent to jet broadening (where the
thrust axis is replaced by the jet axis). Girth has been shown to be particularly useful
for distinguishing quark-initiated jets from gluon-initiated jets [23].
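A minimal sketch of girth as the $p_T$-weighted mean distance of the constituents from the jet axis follows. It assumes constituents are given as `(pt, r_i)` pairs and that the jet $p_T$ is well approximated by the scalar sum; the "narrow" and "wide" jets are invented to mimic the qualitative quark/gluon difference and are not taken from any study.

```python
def girth(constituents):
    """Linear radial moment: sum_i pT_i * r_i / pT_jet.

    constituents: list of (pt, r_i) pairs, r_i measured from the jet axis.
    """
    jet_pt = sum(pt for pt, _ in constituents)  # scalar-sum approximation
    return sum(pt * r_i for pt, r_i in constituents) / jet_pt


narrow_jet = [(90.0, 0.02), (10.0, 0.20)]              # quark-like toy jet
wide_jet = [(50.0, 0.05), (30.0, 0.25), (20.0, 0.40)]  # gluon-like toy jet

print(girth(narrow_jet), girth(wide_jet))
```

The jet with more radiation at wide angles has the larger girth, which is the handle used for quark/gluon discrimination.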



Angularities [23] are a related family of jet shapes, defined as a function of the
parameter $a$:

(3.30)

Here $\eta_i$ is the separation in rapidity only between particle $i$ and the jet axis, and $p_{\perp,i}$
the momentum transverse to the jet axis.

3.2.2 Discriminating boosted decay kinematics 

The radial distribution jet shapes discussed in the previous section are geared toward 
probing the characteristic shower structure of QCD. Here we will discuss several exam- 
ples of jet shapes which target evidence of non-QCD-like substructure in jets. 

Planar flow [21] considers the spread of the jet's radiation in the plane transverse to
the jet axis (see also the closely related jet transverse sphericity shape [25]). Since QCD
showers are angular-ordered, radiation subsequent to the first emission $P \to ij$ tends to
be concentrated between the clusters of energy defined by $i$ and $j$, leading to a roughly
linear distribution of energy in the jet. By contrast, boosted three-body decays, such as
boosted tops, have a more planar distribution of energy.
Define the tensor

$$I^{ab} = \frac{1}{m_J} \sum_{i \in J} \frac{p_{i,\perp}^a\, p_{i,\perp}^b}{E_i}, \tag{3.31}$$

where the indices $a$, $b$ span the plane perpendicular to the jet axis, and $\vec p_{i,\perp}$ denotes the
projection of particle $i$'s momentum into this plane. Letting $\lambda_1$, $\lambda_2$ be the eigenvalues
of $I^{ab}$, the planar flow of a jet is given by

$$Pf_J = \frac{4 \lambda_1 \lambda_2}{(\lambda_1 + \lambda_2)^2} = \frac{4 \det I}{({\rm tr}\, I)^2}. \tag{3.32}$$

With this normalization, $Pf_J \in (0, 1)$. Monte Carlo studies have demonstrated that
QCD events do indeed peak at low values of $Pf$, while boosted top decays show a
relatively flat distribution in $Pf$, but preliminary results show some sensitivity to shower
modeling [25] and the utility of this shape in data is so far unclear.
Note that neither $I^{ab}$ nor its eigenvalues are invariant under longitudinal boosts. For
fully reconstructible events this is not a worry in theory, as all events can be considered
in the reconstructed CM frame, but finite experimental resolution can become an issue
in transforming from the lab frame into the CM frame.
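For concreteness, planar flow can be computed from the $2 \times 2$ tensor without an explicit eigenvalue decomposition, since only the determinant and trace enter. The sketch below assumes the standard energy-weighted form of the tensor, $\sum_i p^a_{i,\perp} p^b_{i,\perp} / E_i$ divided by the jet mass, and takes each constituent as a hypothetical tuple `(E, p_a, p_b)` of its energy and its two momentum components transverse to the jet axis. The overall $1/m_J$ cancels in the ratio, so the default `jet_mass=1.0` is harmless.

```python
import math


def planar_flow(constituents, jet_mass=1.0):
    """Planar flow 4 det(I) / tr(I)^2 from (E, p_a, p_b) constituent tuples."""
    i_aa = i_ab = i_bb = 0.0
    for energy, p_a, p_b in constituents:
        i_aa += p_a * p_a / (energy * jet_mass)
        i_ab += p_a * p_b / (energy * jet_mass)
        i_bb += p_b * p_b / (energy * jet_mass)
    det = i_aa * i_bb - i_ab * i_ab
    trace = i_aa + i_bb
    return 4.0 * det / trace ** 2


# Toy "linear" jet: all transverse momenta along one direction.
linear = [(100.0, 5.0, 0.0), (100.0, -5.0, 0.0), (50.0, 2.0, 0.0)]

# Toy "planar" jet: three equal prongs at 120 degrees in the transverse plane.
s = math.sqrt(75.0)
planar = [(100.0, 10.0, 0.0), (100.0, -5.0, s), (100.0, -5.0, -s)]

print(planar_flow(linear), planar_flow(planar))
```

The two toy configurations sit at the extremes of the allowed range: zero for a purely linear energy distribution and one for a symmetric three-prong configuration.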

Template overlaps define jet shapes based on (aspects of) the matrix elements for 
boosted object decays [26]. For example, consider the three body top quark decay with 
intermediate on-shell W. The phase space for this decay is (in the narrow-width ap- 
proximation) determined by four parameters, which can be parameterized as the solid 
angle governing the two-body decays of both the $t$ and its daughter $W$. Note that (1)
the azimuthal angle $\phi_t$ is meaningful, as the detector geometry is not invariant under
rotations around the top direction of motion, and (2) this phase space has both $m_t$ and
$m_W$ built in. A series of templates describing this phase space can be generated by
discretizing the four-dimensional space. To use these templates on a jet, the method of 
template overlaps finds the template which has best overlap with the kinematic configu- 
ration of the jet constituents according to a chosen metric. The ultimate variable is the 
numerical value of the best overlap, which distinguishes between QCD jets and boosted 
tops. 

N-subjettiness [27] takes a different and more general approach to probing jet substructure
via jet shapes. Given $N$ axes $\hat n_k$, we define N-subjettiness as

$$\tau_N = \frac{\sum_{i \in J} p_{T,i}\, \min_k(\Delta R_{ik})}{\sum_{i \in J} p_{T,i}\, R_0}, \tag{3.33}$$

where $R_0$ is the jet radius, and $\Delta R_{ik}$ is the distance between particle $i$ and axis $\hat n_k$.
The smaller $\tau_N$ is, the more radiation is clustered around the chosen axes, or in other
words, smaller values of $\tau_N$ indicate a better characterization of the jet $J$ as having $N$
(or fewer) subjets. Conversely, if $\tau_N$ is large, then a description in terms of $> N$ subjets
is better.

However, as QCD alone will happily make jets with subjets, to differentiate boosted
objects we need to probe not just the possible existence of subjets, but their structure.
The real distinguishing power of N-subjettiness occurs when looking at ratios. For
instance, a two-prong boosted particle such as a Higgs or $W$ will have large $\tau_1$ and
small $\tau_2$. QCD jets which have small $\tau_2$ will generically have smaller $\tau_1$ than for signal,
as the QCD jets are more hierarchical; conversely, QCD jets which have large $\tau_1$ are
generally diffuse, and will have larger $\tau_2$ as well than for signal. Thus the best single
discriminating variable is $\tau_2/\tau_1$, or, more generally,

$$r_N = \frac{\tau_N}{\tau_{N-1}} \tag{3.34}$$

for a boosted $N$-prong particle.
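A short sketch of the N-subjettiness sum and the ratio $\tau_2/\tau_1$ follows, assuming the subjet axes are supplied by hand rather than found by a jet algorithm or by minimization. Constituents are hypothetical `(pt, y, phi)` tuples, axes are `(y, phi)` pairs, and the toy jet is an invented two-prong configuration.

```python
import math


def delta_r(y1, phi1, y2, phi2):
    """Distance in the (y, phi) plane, with phi wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(y1 - y2, dphi)


def n_subjettiness(constituents, axes, r0):
    """tau_N for the given axes: pT-weighted distance to the nearest axis,
    normalized by sum(pT) * R0."""
    numerator = sum(
        pt * min(delta_r(y, phi, ay, aphi) for ay, aphi in axes)
        for pt, y, phi in constituents
    )
    denominator = r0 * sum(pt for pt, _, _ in constituents)
    return numerator / denominator


# Invented two-prong jet: two hard cores, each with a little nearby radiation.
two_prong = [
    (200.0, 0.30, 0.00), (20.0, 0.35, 0.05),
    (180.0, -0.30, 0.00), (15.0, -0.25, -0.05),
]

tau1 = n_subjettiness(two_prong, axes=[(0.0, 0.0)], r0=1.0)
tau2 = n_subjettiness(two_prong, axes=[(0.30, 0.0), (-0.30, 0.0)], r0=1.0)
print(tau1, tau2, tau2 / tau1)
```

For this two-prong toy jet $\tau_1$ is sizeable while $\tau_2$ is small, so the ratio $\tau_2/\tau_1$ is far below one, as expected for a signal-like configuration.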

The question of how to determine the input subjet axes is an interesting one. One
approach, which is fast and perfectly serviceable for most applications, is to use a jet
algorithm, such as exclusive $k_T$, to determine subjet axes. Naturally, the results then
retain some dependence on the choice of jet algorithm used to find the axes. Another
approach is to marginalize over all possible choices of $\hat n_k$, and choose the set which
minimizes $\tau_N$ [28]. While this choice is computationally more intensive, it removes the
dependence on the jet algorithm choice, and additionally guarantees the nice property
that

$$\tau_{N-1} > \tau_N, \tag{3.35}$$

which holds only approximately if fixed subjet axes are used.

N-subjettiness is a conceptual descendant of the event shape N-jettiness [29], which
classifies events as being N-jet-like without reference to jet algorithms.

3.2.3 Color flow variables 

Beyond kinematics, boosted perturbative decays can also differ from QCD backgrounds
in their color structure. Consider a color singlet such as an $H$ or $W$ boson decaying to
a quark-antiquark pair. The daughter quark jets form a color dipole: they are color-connected
to each other, but not to the rest of the event. Meanwhile, the backgrounds
to these processes come from QCD dijets, which necessarily have different color connections,
as we show in Fig. 3, where the radiation patterns for a color-singlet signal are
plotted on the left and for a typical background on the right, as computed in the eikonal
(soft) approximation. This observation has motivated work on variables which can add
color flow to the suite of features which can discriminate signal from background.

Figure 3: Radiation patterns in the eikonal approximation for two triplet color sources
color-connected to each other (left) and to the beam (right). Contours are logarithmic,
and the scales in the two figures are not the same.

Jet pull [30] defines for each jet a transverse vector $\vec t_J$ characterizing the net directional
distribution of the soft radiation surrounding the jet core. Defining $\vec r_i$ as the
(transverse) direction of particle $i$ from the jet axis, the pull vector is

$$\vec t_J = \sum_{i \in J} \frac{p_{T,i}\, |\vec r_i|}{p_{T,J}}\, \vec r_i. \tag{3.36}$$

The direction of $\vec t_J$ relative to other jets in the event then is sensitive to the color
connection of the jet $J$. Two jets which are color-connected to each other will have pull
vectors pointing toward each other. Jets which are color-connected to the beam will
have pull vectors pointing toward the beam. Once two interesting (sub)jets have been
identified, the discriminating variable is then $\cos\theta_t$, where $\theta_t$ is the angle between the pull vector
and the line connecting the two (sub)jet cores. An initial experimental study of pull has
been carried out at D0, using the $W$ in top events.
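The pull vector and the associated angle can be sketched as below, assuming the commonly used form of pull in which each constituent contributes $p_{T,i} |\vec r_i| \vec r_i / p_{T,J}$, with $\vec r_i$ the displacement from the jet axis in the $(y, \phi)$ plane. The jet axis is taken as given, the jet $p_T$ is approximated by the scalar sum, and all names and numbers are hypothetical.

```python
import math


def wrap(dphi):
    """Wrap an azimuthal difference into (-pi, pi]."""
    return (dphi + math.pi) % (2.0 * math.pi) - math.pi


def pull_vector(constituents, jet_axis):
    """Pull (t_y, t_phi) of a jet; constituents are (pt, y, phi) tuples and
    jet_axis is the (y, phi) position of the jet core."""
    jet_pt = sum(pt for pt, _, _ in constituents)
    t_y = t_phi = 0.0
    for pt, y, phi in constituents:
        dy, dphi = y - jet_axis[0], wrap(phi - jet_axis[1])
        weight = pt * math.hypot(dy, dphi) / jet_pt
        t_y += weight * dy
        t_phi += weight * dphi
    return t_y, t_phi


def cos_pull_angle(pull, jet_axis, other_axis):
    """Cosine of the angle between the pull and the line to the other core."""
    d_y, d_phi = other_axis[0] - jet_axis[0], wrap(other_axis[1] - jet_axis[1])
    dot = pull[0] * d_y + pull[1] * d_phi
    return dot / (math.hypot(*pull) * math.hypot(d_y, d_phi))


# Invented jet at the origin whose soft radiation leans toward a partner at y = 1.
jet_axis, partner_axis = (0.0, 0.0), (1.0, 0.0)
jet = [(100.0, 0.0, 0.0), (5.0, 0.30, 0.02), (3.0, 0.20, -0.03)]

pull = pull_vector(jet, jet_axis)
print(pull, cos_pull_angle(pull, jet_axis, partner_axis))
```

Because the soft constituents lie between the two cores, the pull points almost exactly at the partner jet, the pattern expected for a color-connected pair.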



Dipolarity [32] is a jet shape which is designed to test for color dipole-like structure
when the parent particle is boosted and the two (sub)jets of interest are geometrically
nearby. Since pull scales like $r_i^2$, it can be unduly sensitive to the detailed assignment
of particles between the two (sub)jet cores to one or the other of the two (sub)jets.
Dipolarity therefore uses as the relevant distance measure $R_i$, the transverse distance of
particle $i$ to the line segment connecting the (sub)jet cores,

$$\mathcal{D} = \frac{1}{R_{12}^2} \sum_{i \in J} \frac{p_{T,i}}{p_{T,J}}\, R_i^2, \tag{3.37}$$

where $R_{12}$ is the separation between the two (sub)jet cores.
Note that dipolarity requires input (sub)jet axes. The major application studied to date
has been in boosted top tagging, where dipolarity can improve the identification of the
boosted daughter $W$.
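The geometric ingredient of dipolarity, the distance from a particle to the line segment joining the two cores, is easy to get wrong, so a small sketch may help. It assumes the usual definition of dipolarity as the $p_T$-weighted mean of $R_i^2$ divided by the squared core separation $R_{12}^2$, works in the $(y, \phi)$ plane, and ignores the $\phi$ wrap-around (that is, it assumes the jet is far from $\phi = \pm\pi$). All names and values are hypothetical.

```python
import math


def dist_to_segment(point, a, b):
    """Euclidean distance from `point` to the segment a-b in the (y, phi) plane."""
    ab = (b[0] - a[0], b[1] - a[1])
    ap = (point[0] - a[0], point[1] - a[1])
    t = (ap[0] * ab[0] + ap[1] * ab[1]) / (ab[0] ** 2 + ab[1] ** 2)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(ap[0] - t * ab[0], ap[1] - t * ab[1])


def dipolarity(constituents, core1, core2):
    """pT-weighted mean squared distance to the core-core segment, over R12^2.

    constituents: (pt, y, phi) tuples; core1, core2: (y, phi) positions.
    """
    jet_pt = sum(pt for pt, _, _ in constituents)
    r12_sq = (core1[0] - core2[0]) ** 2 + (core1[1] - core2[1]) ** 2
    total = sum(
        pt * dist_to_segment((y, phi), core1, core2) ** 2
        for pt, y, phi in constituents
    )
    return total / (jet_pt * r12_sq)


core1, core2 = (-0.2, 0.0), (0.2, 0.0)
# Soft radiation between the cores (dipole-like) versus well off the segment.
dipole_like = [(100.0, -0.2, 0.0), (90.0, 0.2, 0.0), (10.0, 0.0, 0.02)]
off_axis = [(100.0, -0.2, 0.0), (90.0, 0.2, 0.0), (10.0, 0.0, 0.30)]

print(dipolarity(dipole_like, core1, core2), dipolarity(off_axis, core1, core2))
```

Radiation lying along the segment between the cores gives a much smaller value than the same radiation displaced away from it, which is the sense in which small dipolarity signals a color dipole.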

Keeping the right soft radiation. We have emphasized the need for jet grooming
tools in the busy, high luminosity environment of the LHC. However, that grooming will
groom away most if not all of the information about color flow. To use the information
contained in an event's color flow, it is necessary to retain at least some of the soft
radiation. Exactly which soft radiation is included, and at which stage in the analysis,
is a question which has to be addressed case-by-case. As an example, we will discuss how
the dipolarity shape can be incorporated into a boosted top tagger [32]. Top tagging
will be discussed at length in the next section; for the moment, it suffices to think of a
top tagger as an algorithmic black box which acts on a fat jet to return candidate $b$, $j_1$,
and $j_2$ subjets, and discards some radiation in the process.

The returned subjet axes define the characteristic opening scale, R12, and provide 
the input axes for the dipolarity jet shape. As the top-tagger has discarded some of the 
radiation associated with the top quark in identifying the candidate subjets, to evaluate 
dipolarity we will need to go back to the original fat jet and include a larger subset 
of particles. Clearly, the radiation we'd like to include when evaluating the dipolarity 
of the candidate W daughters is only that associated with the two light quark jets; 
including radiation originating from the b would just skew the results. Let us consider 
only moderately boosted tops, such that the b jet is not overlapping with the other two. 
From the angular-ordered property of QCD showers, we know that in top events, all
radiation associated with either light quark must be at angular separations less than
the opening angle of the dipole, $\Delta R < R_{12}$. Thus, all radiation from the $W$ is contained
in cones of radius $R_{12}$ around each light quark jet. The authors of Ref. [32] find that
keeping all radiation within these two cones is casting too wide a net, however, and a
smaller cone size of $R_{12}/\sqrt{2}$ is a better tradeoff between keeping all the radiation from
the $W$ and avoiding pollution from pileup, underlying event, and splash-in from the
nearby $b$.

Color flow variables capture a genuine physical difference between signal and background.
They have been shown, in theoretical work, to make a sizeable impact in signal
significance [ISl [321 [33] , and show great promise as tools to expand our understanding 
of SM and BSM physics. It is important to bear in mind, however, that these "proof of
principle" analyses have all been performed using shower Monte Carlos, which capture
only leading approximations to the full QCD dynamics. Just as the jet shapes discussed
in section 3.2.1 above have been and are still important tools for assessing the validity
of the approximations made in the Monte Carlo generators, measuring and calibrating
color flow variables in data is critical to understand the validity of the shower models
and the performance of any color flow variable. This experimental program is, as of
yet, in its infancy. In the meantime, theoretical studies should bear this uncertainty
in mind. To estimate the uncertainties, it is useful (as it is for any novel substructure
variable) to check results using more than one shower model.
4 Lecture III: Top tagging and searches for physics
BSM



In this section we will assemble the tools and techniques developed in the previous 
two sections and apply them to searches for physics beyond the standard model. By 
far the most universally motivated application of jet substructure techniques to BSM 
physics is in the hunt for TeV-scale new states which decay to electroweak-scale SM 
particles. The best reason for new physics to live anywhere near the weak scale is that 
it is partially responsible for the generation of the electroweak scale. New physics that
is related to EWSB will naturally couple most strongly to those particles in the SM
which feel EWSB most strongly, in particular the top quark and the EW bosons ($H$,
$W$, and $Z$), and thus will decay preferentially to these heavy particles rather than to the
light quarks and leptons which yield simpler final states. Moreover, we have compelling
reasons to believe new physics will naturally decay to boosted SM particles. Even before
the LHC turned on, the lack of deviations from SM predictions for flavor or precision
electroweak observables already hinted that the likely scale for new physics was not $v_{EW}$,
as naturalness might have suggested, but rather $\Lambda \gtrsim$ few TeV. Evidence for this "little
hierarchy" problem has of course only gotten stronger as the LHC has directly explored
physics at TeV scales. Thus many models which address the stabilization of the EW
scale will naturally give rise to final states rich in boosted tops, Higgses, $W$'s and $Z$'s.

In this section we will provide an introduction to top tagging at the LHC, followed 
by a few brief concluding comments on searching for more general BSM physics with 
jets. 

4.1 Top Tagging 

As we established in Section 1, top pair production at the LHC covers a broad range
of kinematic regimes interpolating between threshold ($\sqrt{s} = 2m_t$), where tops are well
described as a six-object final state, up to TeV-scale energies, where the tops are highly
collimated and are best described as a two-object final state. Top reconstruction must
thus be able to flexibly cover a wide range of kinematic scenarios. In the interest of time,
we will restrict our attention here to top taggers which target the hadronic decay of the
top quark, although the semi-leptonic decay mode also requires interesting techniques
for identification and reconstruction.

As for jet algorithms, the "best" top tagger depends on the question being asked. In
particular, different strategies are required at high $p_T$ ($p_T \gg m_t$) versus moderate $p_T$
($p_T \gtrsim m_t$). Another question is: what signal efficiency is necessary? Every tagging technique
trades off signal efficiency against background mistag rate. Depending on the search in
question, the composition of the backgrounds will change, and therefore the necessary
mistag rate will shift as well. For example, consider a top pair event with at least one
boosted hadronic top. If the other top is also hadronic, then QCD dijets are by far the
dominant background, and small QCD mistag rates are required. But if the other top
is leptonic, then $W$+jets becomes an important background, and if the top is produced
in association with some new physics objects, such as $\not\!\!E_T$, then the backgrounds may
be substantially smaller, and mistag rates may be entirely unimportant.

The aim of this section is to provide an introduction to top tagging by discussing
a representative variety of top taggers. Specifically, we will consider the top taggers
currently used by both LHC experiments, which work best in the highly boosted regime;
the "HEP top tagger", which targets moderate $p_T$; and top tagging with N-subjettiness.

4.1.1 CMS top tagger 

The hadronic top tagger used by CMS is largely based on the "Hopkins" top tagger
[37]. It builds on the techniques of the boosted Higgs "splitting/filtering" or "mass
drop" analysis, which we discussed in Section 1. Thus, we again begin by clustering
the event using the C-A algorithm, on large angular scales, capturing all of the top
decay products in a single fat jet, which we will then unwind until we find interesting
substructure. Compared to the Higgs analysis, there are two important differences.
First, we are looking for at least three hard subjets, instead of two. Second, we take
the fat jet radius to be noticeably smaller than we did for the Higgs case: $R = 0.8$.
Using our rule of thumb, $R \sim 2m/p_T$, this means we are targeting tops with $p_T \gtrsim 500$
GeV: appropriate for production from a TeV-scale resonance. Contrast this with the
boosted Higgs, which was targeting the high-$p_T$ tails of SM associated production, where
requiring large $p_T$ imposed a significant price in signal acceptance.
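As a quick numerical check of this estimate, one can invert the rule of thumb, read here as $R \sim 2m/p_T$ (the form quoted above for the pruning radius), to find the $p_T$ at which the decay products of a particle of mass $m$ fit inside a jet of radius $R$. The function name and the value $m_t = 173$ GeV are assumptions made for illustration.

```python
def min_pt_for_radius(mass, radius):
    """Invert R ~ 2m/pT: the pT (GeV) above which decay products of a particle
    of the given mass (GeV) are roughly contained within a jet of radius R."""
    return 2.0 * mass / radius


M_TOP = 173.0  # GeV, approximate top mass (assumed value)

for radius in (0.8, 1.0, 1.5):
    print(radius, min_pt_for_radius(M_TOP, radius))
```

The three radii are those used by the taggers discussed in this section; the corresponding $p_T$ scales fall from roughly 430 GeV at $R = 0.8$ to roughly 230 GeV at $R = 1.5$.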

Iteratively declustering the fat jet, we encounter splittings $P \to ij$. Our criterion for
an interesting splitting is simply that both daughter subjets must carry a sufficiently
large fraction of the total fat jet momentum,

$$p_{T,j} > \delta_p\, p_{T,J} \tag{4.38}$$

for some parameter $\delta_p$. If a splitting fails to meet this criterion, discard the softer of
$i$, $j$, and continue to unwind the harder. The splitting is rejected if it is too collinear,
$|\Delta\eta_{ij}| + |\Delta\phi_{ij}| < \delta_r$, for another parameter $\delta_r$. This procedure stops when either both
$i$, $j$ are softer than $\delta_p\, p_{T,J}$, or only one particle is left.
Figure 4: Leading order distributions of $m_{bd}$ (blue) and $m_{bu}$ (purple) in unpolarized top
decay.

If an interesting hard, non-collinear splitting $P \to j_1 j_2$ is found, then the next step
is to successively unwind both $j_1$ and $j_2$ according to the same algorithm, in search of
further interesting splittings. This procedure returns a set of 2, 3, or 4 subjets. Fat
jets returning only 2 subjets don't have enough substructure to be good top candidates,
and are rejected. Jets which return 3 or 4 subjets do show enough substructure to be
interesting, and the next step is to test whether or not they also have top-like kinematics.

As for the Higgs, the single most important discriminator is the jet mass. CMS
requires that the jet mass, as computed from the sum of the returned subjets, lie within
a top mass window, $m_t - 75$ GeV $< m_J < m_t + 75$ GeV.

The onshell decay of the W inside the jet will also help us separate signal from back- 
ground, but rather than trying to explicitly identify a pair of subjets which reconstruct 
a W — a procedure highly vulnerable to the misassignment of particles in overlapping 
jets — we will exploit the presence of the W mass scale in a less direct way. 

The pairwise invariant masses of all possible combinations of the three daughter
quarks are all governed by the mass scales in the top matrix element, $m_t$ and $m_W$.
The distribution of the invariant mass of the $b$ and the $d$-type quark (equivalent to the
charged lepton in leptonic top decay) is shown in Fig. 4. The most likely value of $m_{bd}$ is
approximately 115 GeV. The invariant mass of the $b$ and the $u$-type quark (equivalent
to the neutrino) is peaked at even larger values. By contrast, subjet masses from QCD
background processes are hierarchically smaller than the total parent jet mass. Thus
instead of trying to reconstruct the $W$, we simply require that the minimum of the
invariant masses formed from pairs of the three hardest subjets be sufficiently large to
reject backgrounds,

$$\min(m_{12},\, m_{13},\, m_{23}) > 50\ {\rm GeV}. \tag{4.39}$$

These cuts on masses, together with the substructure requirement, constitute the tagger.
Note that no $b$-tagging information is used. Tagging $b$-jets is very difficult in this
environment for two reasons. First, the $b$ is embedded in a highly collimated top, so
disentangling the tracks that are associated with the $b$ from the other tracks in the jet is
challenging. Second, the $b$ itself is at very high $p_T$, so the opening angles of its daughter
products are small, and it is difficult to get sufficient resolution from the reconstructed
tracks to reconstruct the displaced vertex. Note also that the tagger doesn't require jet
grooming. This is partly because the iterative decomposition procedure is performing
some of that function in its own right, as it discards soft wide-angle radiation in the
process of finding hard subjets (compare pruning). The smaller geometric size of the
fat jets also means that pollution is not as large an effect.
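The kinematic part of the tagger, the mass window and the minimum pairwise mass, can be sketched as follows. The sketch assumes the declustering step has already returned a list of subjet four-vectors `(E, px, py, pz)` in GeV; the function names, the value $m_t = 173$ GeV, and both example configurations are hypothetical. The "top-like" example is an idealized symmetric three-body decay written in the top rest frame, which is sufficient because the cuts involve only Lorentz-invariant masses.

```python
import math
from itertools import combinations


def inv_mass(*vectors):
    """Invariant mass of the sum of (E, px, py, pz) four-vectors."""
    e, px, py, pz = (sum(v[k] for v in vectors) for k in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def passes_kinematic_cuts(subjets, m_top=173.0, window=75.0, min_pair=50.0):
    """Mass window on all returned subjets, plus the minimum pairwise mass of
    the three hardest (by pT) subjets."""
    if len(subjets) < 3:
        return False  # not enough substructure
    if not (m_top - window < inv_mass(*subjets) < m_top + window):
        return False
    hardest = sorted(subjets, key=lambda v: math.hypot(v[1], v[2]), reverse=True)[:3]
    return min(inv_mass(a, b) for a, b in combinations(hardest, 2)) > min_pair


# Idealized top-like configuration: three massless prongs at 120 degrees.
e = 173.0 / 3.0
s = math.sqrt(3.0) / 2.0
top_like = [(e, e, 0.0, 0.0), (e, -e / 2, e * s, 0.0), (e, -e / 2, -e * s, 0.0)]

# Hierarchical QCD-like configuration: one hard prong plus soft, nearby emissions.
qcd_like = [
    (300.0, 300.0, 0.0, 0.0),
    (40.0, 40.0 * math.cos(0.30), 40.0 * math.sin(0.30), 0.0),
    (10.0, 10.0 * math.cos(0.35), 10.0 * math.sin(0.35), 0.0),
]

print(passes_kinematic_cuts(top_like), passes_kinematic_cuts(qcd_like))
```

The hierarchical configuration fails already at the mass window, and its pairwise masses are much smaller than the jet $p_T$, illustrating why the minimum pairwise mass is an effective discriminator.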

4.1.2 ATLAS top tagger 

We turn next to the ATLAS top tagger. Like CMS' tagger, it is optimized for high
$p_T$, and like CMS' tagger, it is based on iterative declustering of a sequential algorithm.
However, the ATLAS tagger draws on a very different set of ideas, largely based on work
by Ref. [38] and the "Y-splitter" of Ref. [39].

The ATLAS top tagger begins by clustering events using the anti-$k_T$ algorithm with
$R = 1.0$. (The slightly larger jet radius means that this tagger works best at slightly
lower $p_T$ than does the CMS tagger.) Since the anti-$k_T$ algorithm knows nothing about
the singularity structure of QCD, its use is simply to identify a nicely regular initial set
of particles. The next step is to take this set of particles and recluster them using the
$k_T$ algorithm.

Recall that the $k_T$ algorithm preferentially clusters soft splittings first. This means that
the hardest splittings in the jet are the very last ones. Thus, there is no need to do any
preliminary unwinding, and the existence of hard substructure is directly reflected in
the hardness of the scales given by the $k_T$ metric evaluated on the last few splittings in
the jet:

$$d_{ij} = \min(p_{T,i}^2,\, p_{T,j}^2)\, \Delta R_{ij}^2. \tag{4.40}$$

Large splitting scales mean the emissions are both hard and at wide angles. The ATLAS
tagger uses as inputs the splitting scales of the last three recombinations, $d_{12}$, $d_{23}$, and
$d_{34}$. The first two splittings correspond (usually) to the identification of the three
daughter partons, and the third to possible FSR from one of the partons. Since for tops
the splitting $d_{34}$ is the first which comes from the QCD shower, its scale can still be
relatively large; on the other hand, for background QCD jets, the hierarchical nature
of the shower means that generally $d_{34} \ll d_{23} \ll d_{12}$. Thus cuts on $d_{34}$ maintain some
discriminating power.

However, instead of cutting directly on the massive splittings $d_{ij}$, it is advantageous
to change variables to a set which are less correlated with the jet and subjet invariant
masses [25]. We define the energy sharing variables

$$z_{ij} = \frac{d_{ij}}{d_{ij} + m_{ij}^2} \to \frac{p_{T,j}}{p_{T,i} + p_{T,j}}, \tag{4.41}$$

where in the last step we have taken the collinear limit (and $p_{T,i} > p_{T,j}$). Notice that by
performing this change of variables we have removed sensitivity to the collinear singularity,
so that $z_{ij}$ is only capturing information about the soft singularity. Meanwhile,
jet invariant masses still retain information about the relative angles between the jets,
so the correlation between the variables has been reduced.
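The collinear limit can be checked numerically. The sketch below assumes the energy sharing variable is $d_{ij}/(d_{ij} + m_{ij}^2)$ with $d_{ij}$ the $k_T$ splitting scale, treats both subjets as massless so that $m_{ij}^2 = 2 p_{T,i} p_{T,j} (\cosh\Delta y - \cos\Delta\phi)$, and compares the result with $p_{T,j}/(p_{T,i} + p_{T,j})$ for a small and a large opening angle. The function names and input values are hypothetical.

```python
import math


def kt_distance(pt_i, pt_j, dy, dphi):
    """kT splitting scale: min(pT_i, pT_j)^2 * DeltaR^2."""
    return min(pt_i, pt_j) ** 2 * (dy * dy + dphi * dphi)


def pair_mass_sq(pt_i, pt_j, dy, dphi):
    """Invariant mass squared of two massless subjets."""
    return 2.0 * pt_i * pt_j * (math.cosh(dy) - math.cos(dphi))


def energy_sharing(pt_i, pt_j, dy, dphi):
    """z_ij = d_ij / (d_ij + m_ij^2)."""
    d = kt_distance(pt_i, pt_j, dy, dphi)
    return d / (d + pair_mass_sq(pt_i, pt_j, dy, dphi))


pt_i, pt_j = 300.0, 100.0
collinear_limit = pt_j / (pt_i + pt_j)

z_small_angle = energy_sharing(pt_i, pt_j, dy=0.05, dphi=0.05)
z_wide_angle = energy_sharing(pt_i, pt_j, dy=0.8, dphi=0.6)
print(collinear_limit, z_small_angle, z_wide_angle)
```

At small opening angle the angular factors cancel between $d_{ij}$ and $m_{ij}^2$, so $z_{ij}$ depends only on how the momentum is shared; at wide angle the cancellation is only approximate.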

The final set of variables that make up the ATLAS top tagger is then: 

• The total jet mass, $m_J$. The tagger requires $m_J > 140$ GeV, and no upper bound:
no grooming procedure is used, so the mass spectrum is distorted upwards.

• The variable $Q_W$, defined as the minimum pair invariant mass of the three subjets
identified at the splitting scale $d_{23}$. This is the equivalent to cutting on the
minimum pair invariant mass in the CMS tagger; only the method of finding the
subjets is different. We require $Q_W > 50$ GeV.

• All three energy sharing variables, $z_{12}$, $z_{13}$, and $z_{23}$, which are subject to numerical
cuts.



4.1.3 HEP top tagger 

We turn now to the Heidelberg-Eugene-Paris top tagger, which functions on tops with
$p_T \gtrsim 200$ GeV [40, 41]. In some sense this algorithm is more of an event reconstruction
strategy than a top tagger. The algorithm begins by clustering the event using C-A on
the extremely large angular scale $R = 1.5$, and requiring the fat jets thus formed to have
$p_T > 200$ GeV. The $p_T$ cut of 200 GeV puts us in the regime where the top is sufficiently
boosted that its decay products will frequently lie in a single hemisphere. Looking at
extremely fat jets is effectively identifying hemispheres in an event while avoiding the
need to set any fixed angular scales for resolution within those hemispheres. This is
an effective strategy for tops in this intermediate kinematic regime, where events will
straddle any fixed angular scale; by unwinding C-A hemispheres, we allow the angular
scales to be flexibly identified event by event.

The next step is to unwind the fat jet looking for interesting hard structure. This is
done by employing a (loose) mass-drop criterion. For a splitting $P \to ij$, with $m_j < m_i$,
the splitting is deemed sufficiently interesting if

$$m_j > 0.2\, m_P. \tag{4.42}$$

If the splitting passes this criterion, retain both $i$ and $j$ in the list of jets to unwind;
otherwise, discard $j$ and keep unwinding $i$ until $m_i < 30$ GeV, at which point the
unwinding stops. This unwinding procedure is performed on all subjets identified as
interesting via Eq. (4.42). The output of this step is a list of subjets $\{j_i\}$ resulting from
this iterative declustering; if there are at least 3 such subjets, then we have found enough
substructure to continue.
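The unwinding just described can be sketched as a short recursion. The sketch assumes the C-A cluster sequence is available as a hypothetical binary tree in which every node stores only its invariant mass, and it uses the criterion exactly as stated above (lighter daughter heavier than 20% of the parent, stop below 30 GeV). The tree and its masses are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """A (sub)jet in a clustering tree, labelled by its mass in GeV."""
    mass: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def unwind(node: Node, out: List[Node], drop: float = 0.2, m_stop: float = 30.0) -> None:
    """Collect in `out` the subjets left after the loose mass-drop unwinding."""
    if node.mass < m_stop or node.left is None or node.right is None:
        out.append(node)  # light enough, or nothing left to undo
        return
    heavy, light = sorted((node.left, node.right), key=lambda n: n.mass, reverse=True)
    unwind(heavy, out, drop, m_stop)
    if light.mass > drop * node.mass:
        unwind(light, out, drop, m_stop)  # interesting splitting: keep both
    # otherwise the lighter daughter is discarded


# Invented fat jet: soft junk clustered last, then a three-prong core.
w_like = Node(95.0, left=Node(28.0), right=Node(25.0))
core = Node(180.0, left=w_like, right=Node(40.0))
fat_jet = Node(185.0, left=core, right=Node(5.0))

subjets: List[Node] = []
unwind(fat_jet, subjets)
print([n.mass for n in subjets])
```

The soft 5 GeV branch is discarded at the first step, and the remaining tree unwinds into three subjets, enough substructure for the candidate to proceed to the filtering stage.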

At the next stage, we filter the substructures to shrink the geometric area associated
with the top daughters and thereby reduce sensitivity to pileup, etc. Unlike in the Higgs
case, where the mass drop criterion identified a unique angular scale $R_{b\bar b}$ associated
with the sole hard splitting, we have a more complicated set of jets with more than
one interesting splitting, and it is not immediately obvious which angular scale should
be used to filter the event. The HEP top tagger determines the filter radius $R_{\rm filt}$
by brute force, as follows. For each possible set of three subjets that can be drawn
from the $\{j_i\}$, filter them by resolving the constituents of those subjets with radius
$R_{\rm filt} = \min(0.3, \Delta R_{ij})$, and retain up to five subjets. Let $m_{\rm filt}$ be the invariant mass
of these up-to-five filtered subjets, and select the set with $m_{\rm filt}$ closest to $m_t$ as the top
candidate. These up-to-five filtered subjets are then (yet again) reclustered into three
subjets, which are the candidates for the partonic top daughters.

The next step is to test whether or not the reconstructed top daughters have top-like
kinematics. Again, we will exploit the presence of both the top and W mass scales.
We have already used $m_t$ to identify the best set of subjets. Unlike in the previous
taggers, we will now demand evidence of the on-shell W in a more complex way. Label
the three subjets returned by the previous step as $\{j_1, j_2, j_3\}$ in descending order of
$p_T$. Of the three invariant masses $m_{12}$, $m_{13}$, and $m_{23}$, only two are independent,
since the total invariant mass of the three subjets has already been constrained to lie
near $m_t$. This means that the top kinematics is characterized by a specific distribution
in the two-dimensional space determined by the pair invariant masses. Top jets are
focused into a thin triangular annulus in this space, as two subjets reconstruct an
on-shell W (the annulus is triangular since any of the $m_{ij}$ may correspond to the W).
Background, by contrast, is concentrated in regions of small pairwise invariant masses.
The kinematic cuts imposed in the HEP top tagger pick out this top-like triangular
annulus by asking that events lie on one of the three branches of the annulus.

Figure 5: HEP top tagger efficiencies on top quarks for: all top decay products within
$\Delta R = 1.5$ of each other (blue, dashed); all top decay products clustered into R = 1.5
C-A fat jets (purple, dotted); tagged by the HEP top tagger (red, solid); tagged, but
with reconstructed subjets not matching original partons (green, dot-dashed). Data
from Ref. [H].
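The counting of independent pair masses can be made explicit with a short identity. Writing $p_i$ and $m_i$ for the four-momentum and mass of subjet $j_i$ (notation introduced here for illustration), and $m_{ij}^2 = (p_i + p_j)^2$:

```latex
m_{123}^2 = (p_1 + p_2 + p_3)^2
          = m_{12}^2 + m_{13}^2 + m_{23}^2 - m_1^2 - m_2^2 - m_3^2 .
% For light subjets, m_i \approx 0, and a triple selected to have m_{123} \approx m_t:
m_{12}^2 + m_{13}^2 + m_{23}^2 \approx m_t^2 .
```

Once the three-subjet mass is fixed, specifying any two of the pair masses therefore determines the third, up to the (small) subjet masses.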

To understand how well this procedure covers the interpolating kinematic region, we
plot efficiencies for tops to pass through these steps in Fig. 5. As is evident from the
blue (dashed) curve, simply demanding that all decay products of the top lie in a single
hemisphere imposes a non-negligible acceptance price for tops at the low end of the
$p_T$ range, which drops quickly as the tops become more energetic. Further demanding
that the top daughters all be clustered into the same fat jet results in an additional mild
efficiency loss, seen in the purple (dotted) curve. The purple curve is the fraction of tops
giving rise to taggable jets (neglecting the possibility of mistagged signal). The red
(solid) curve denotes the final efficiency of the full HEP top tagger, after the filtering
and kinematic cuts. At low $p_T$, the fraction of taggable jets which are in fact tagged is
near unity, but as the tops become more collimated, the probability of a taggable jet
passing the kinematic cuts falls off, in large part because collimation and jet-particle
misassignment make the W mass reconstruction less precise. At the upper end of the
$p_T$ range shown in the figure, the high-$p_T$ top taggers are useful, and would take over.
Let us also comment that there is a possibility for tops to pass the top tagger by accident,
when the algorithm picks up the wrong set of jets; this is shown in the green (dot-dashed)
curve.



4.1.4 N-subjettiness



As we saw in section 3.2.2, N-subjettiness offers an entirely complementary test of
the existence of hard substructure. A simple and highly effective top tagger can be
constructed using as input variables just the jet mass and the ratio $\tau_3/\tau_2$. Further
refinement is possible with a multivariate analysis which uses in addition $\tau_2/\tau_1$ as well
as $\tau_1$, $\tau_2$, and $\tau_3$ individually [25].
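A two-variable tagger of this kind can be sketched as below. The sketch assumes the standard definition $\tau_N = \sum_k p_{T,k} \min_a \Delta R_{a,k} / (R_0 \sum_k p_{T,k})$ with candidate subjet axes supplied by the caller; it does not attempt the axis-finding step. The function names and the default cut values (a mass window of 160 to 240 GeV and $\tau_3/\tau_2 < 0.6$) are illustrative assumptions, not values quoted in these notes.

```python
import math


def delta_r(y1, phi1, y2, phi2):
    """Distance in the rapidity-azimuth plane, with phi wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(y1 - y2, dphi)


def tau_n(constituents, axes, r0):
    """N-subjettiness for given axes.

    constituents: list of (pt, y, phi); axes: list of (y, phi); r0: jet radius.
    """
    d0 = r0 * sum(pt for pt, _, _ in constituents)
    weighted = sum(
        pt * min(delta_r(y, phi, ya, phia) for ya, phia in axes)
        for pt, y, phi in constituents
    )
    return weighted / d0


def simple_top_tag(mass, tau3, tau2, mass_window=(160.0, 240.0), max_ratio=0.6):
    """Cut-based tag on jet mass and tau3/tau2 (cut values are assumptions)."""
    low, high = mass_window
    return low < mass < high and tau2 > 0.0 and tau3 / tau2 < max_ratio


# Toy jet: three equal-pT constituents, each sitting on its own axis.
toy = [(100.0, 0.0, 0.0), (100.0, 0.5, 0.0), (100.0, 0.0, 0.5)]
tau1 = tau_n(toy, [(0.0, 0.0)], r0=1.0)
tau3 = tau_n(toy, [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5)], r0=1.0)
```

For this toy jet $\tau_3$ vanishes because every constituent lies exactly on an axis, while $\tau_1$ does not, which is the pattern expected for a genuinely three-pronged jet.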

From our experience with the previous taggers, we can guess that even further
improvement would be possible if some information about the W were also incorporated;
since the N-subjettiness jet shape also provides a method of determining subjet axes, it
naturally suggests methods for defining three subjets and computing the analog of $Q_W$.
To the best of the author's knowledge, no such study has been publicly performed.



4.1.5 Top tagging performance 

Let us now consider the performance of the top taggers which we have discussed. This
task is made easier by the work performed in the BOOST 2010 [15] and BOOST 2011
[12] workshops, which compared the performance of different top taggers on the same
reference sets of event samples. These event samples are publicly available online, so
should you develop your own brilliant ideas about top tagging, you can cross-check the
performance of your novel technique with the techniques already in the literature. In
Fig. 6 we show performance curves for the high-$p_T$ top taggers which we discussed above.
Overall, these high-$p_T$ top taggers have efficiencies on the order of $\epsilon \sim 50\%$, at a (QCD)
background mistag rate of $\epsilon_{\rm fake} \sim 5\%$. (For comparison, LHC b-tagging algorithms
achieve $\epsilon \sim 70\%$, with a fake rate $\epsilon_{\rm fake} \sim 1\%$.) It is evident from the performance
curves that the ATLAS tagger outperforms the CMS tagger when high signal efficiency is
required, while CMS does better at lower signal efficiency. Even the simple two-variable
N-subjettiness tagger outperforms both CMS and ATLAS taggers by a notable margin,
except at high signal efficiency, while adding the additional multivariate discrimination
to the N-subjettiness tagger provides a significant improvement. Further updates in the
BOOST 2011 workshop show that (1) being more precise about modeling QCD radiation
at wide angles and (2) including the effects of finite detector resolution reduce typical
efficiencies to $\epsilon \sim 40\%$, at a (QCD) background mistag rate in the range $\epsilon_{\rm fake} \sim 2$ to $8\%$
depending on the tagger [12]. Incorporating finite detector resolution also tends to
reduce (but not erase!) the relative advantage of N-subjettiness over the sequential
decomposition-based taggers.

Figure 6: Top tagging performance curves, for tops with 200 GeV < $p_T$ < 800 GeV in
the BOOST 2010 reference samples for: ATLAS (blue, dotted); CMS (red, dashed); and
N-subjettiness in both the simple (green, solid) and multivariate (green, dot-dashed)
versions. Data from Refs. [15, 25].
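To get a feel for what these working points mean in a search, it can help to translate them into rough gains in signal-to-background. The estimate below is a back-of-the-envelope sketch, assuming a single tagged jet per event, a background made purely of QCD jets, and no correlation between the tag and any other cut; none of these assumptions comes from the workshop reports themselves.

```latex
% Applying a tagger with signal efficiency \epsilon and mistag rate \epsilon_{\rm fake}:
\frac{S}{B} \;\to\; \frac{\epsilon}{\epsilon_{\rm fake}}\,\frac{S}{B},
\qquad
\frac{S}{\sqrt{B}} \;\to\; \frac{\epsilon}{\sqrt{\epsilon_{\rm fake}}}\,\frac{S}{\sqrt{B}} .
% Top tagging at \epsilon \sim 50\%, \epsilon_{\rm fake} \sim 5\%:
\frac{0.5}{0.05} = 10, \qquad \frac{0.5}{\sqrt{0.05}} \approx 2.2 .
% b-tagging at \epsilon \sim 70\%, \epsilon_{\rm fake} \sim 1\%:
\frac{0.7}{0.01} = 70, \qquad \frac{0.7}{\sqrt{0.01}} = 7 .
```

With the BOOST 2011 numbers ($\epsilon \sim 40\%$, $\epsilon_{\rm fake} \sim 2$ to $8\%$) the same estimate gives an $S/B$ gain between roughly 5 and 20 per tagged jet.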

4.2 BSM searches with jet substructure 

New physics produces jets with substructure when the kinematics are governed by a 
nontrivial hierarchy of scales. For the top examples we've been discussing, this hier- 
archy arises from the separation between the scale characterizing new physics and the 
electroweak scale: 

$\Lambda_{\rm NP} > \Lambda_{\rm EW} > \Lambda_{\rm QCD}$. (4.43)

The little hierarchy problem results in a very strong motivation for developing tagging 
techniques for boosted SM objects. Besides the top tagging discussed in the previous 
subsection, much effort has also gone into tagging boosted W, Z, and H bosons arising 
from the decay of new TeV-scale particles|l3l HH SHI HE]. This is fortunate for theorists, 
as, once these techniques are put into use at experiments in one context, the barrier is 
much lower for their adaptation in other contexts where the theoretical motivation may 
not be so universal. 
What other kinds of BSM physics are amenable to substructure analyses? To engender
events with interesting substructure, some multi-tiered hierarchy of scales is
required. We will enumerate an illustrative but far from exhaustive set of examples.

Supersymmetry is one example of a new physics sector which naturally can generate
multiple scales. For example, if supersymmetry is broken at very high scales, RG effects
will drive the colored superpartners much heavier than those superpartners with only
EW charges. Thus, at the weak scale one could naturally expect $M_{\tilde g} \gg M_{\chi^0}$. In the
presence of a large hierarchy between gluino and neutralino, the decay products of the
neutralino would be collimated. Let us further suppose that the neutralino decays via
the R-parity violating udd superpotential operator, so that $\chi^0 \to qqq$. Then gluino
pair production would appear as a six-jet final state, where two of the jets are actually
boosted neutralinos, containing interesting substructure [17]. The very large particle
content of the MSSM can easily accommodate many possible hierarchies, with different
theoretical origins; see for instance Ref. [18] for another of the many possibilities.
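The degree of collimation can be estimated with the usual rule of thumb for the angular spread of the decay products of a boosted particle. The masses and momenta below are purely illustrative numbers chosen for this sketch, not values taken from Refs. [17] or [18].

```latex
% Angular spread of the decay products of a particle of mass m and transverse momentum p_T:
\Delta R \sim \frac{2 m}{p_T} .
% Illustrative example: a neutralino with m_{\chi^0} = 100 GeV carrying p_T = 500 GeV
\Delta R \sim \frac{2 \times 100~\mathrm{GeV}}{500~\mathrm{GeV}} = 0.4 .
```

In this example the three quarks from the neutralino decay would typically fall inside a single jet of ordinary radius, so that the neutralino shows up as one jet with three-pronged substructure.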

Another way to generate a hierarchy in a BSM sector is if the new physics sector
contains a broken global symmetry, so that the scale $\Lambda^{(1)}$ characterizing the lightest
states is set by the magnitude of the global symmetry breaking, rather than by the
overall scale of the new sector, $\Lambda^{(1)} \ll \Lambda^{(2)}$. Thus consider, for example, a composite
rho $\rho_C$, decaying into two pseudo-Nambu-Goldstone bosons $\pi_C$, which are stable within
their own sector, and therefore must subsequently decay into SM objects [49].

Hidden valley models also have this kind of multi-scale structure. Here the hierarchy 
is between the mass of the mediator which connects the visible and hidden sectors and 
the mass scale of the light states in the hidden sector, 

$\Lambda_{\rm med} > \Lambda_{\rm NP} > \Lambda_{\rm SM}$. (4.44)

The mediating particle might be a SM particle, in particular the H or Z, or a novel
field such as a Z' [50] or the SM LSP [51]. Exotic Higgs decays to light particles also
fall under this umbrella [52, 53, 54, 55, 56].

More generally, thinking more broadly and flexibly about jets leads to new approaches
to combinatorics and event reconstruction [57], and provides novel methods
to distinguish QCD events from new physics. As challenging high-multiplicity and
all-hadronic final states become a larger component of the LHC program, flexible and
creative jet techniques will be critical to our ability to discover and interpret the physics.
Jet algorithms themselves are still an evolving field! The anti-$k_T$ algorithm was
introduced only a few years ago. As the nature of the questions that we ask about jets
evolves, so do the best jet algorithms to address these questions. There is still a lot of
room for new ideas!



5 Further Reading 

References which were invaluable in the preparation of these lectures are the text QCD
and Collider Physics, by Ellis, Stirling, and Webber [11], and the lecture notes "Towards
Jetography" by Salam [58]. The proceedings of the BOOST 2010 and 2011 workshops
[15, 12] are valuable resources for those looking for a quantitative survey of both
theoretical and experimental progress in jet physics at the Tevatron and the LHC.